Regardless of the underlying machine architecture, independently
addressable memory modules contribute significantly to program speed-ups on
modern computers. Because of memory conflicts which arise while accessing
data, actual program speed-ups are generally less than theoretically possible.
Organizing the data of a computation so as to avoid memory conflicts is
particularly difficult for data which can logically be viewed as
two-dimensional. Several geometric and algebraic conditions are presented
which determine if the data of a computation can be organized to avoid
memory conflicts. It is shown that a prime number of memory modules gives
higher memory utilization and allows the use of simpler storage schemes
than a power-of-two number of memory modules. The case of greatest practical
significance, references to rows, columns and diagonals of a matrix, is
given special attention. Finally, a brief discussion is presented which
relates this research to that of a companion problem, the construction of
memory-processor connection networks for single-instruction-multiple-data
stream machines.


UIUCDCS-R-75-776


THEORETICAL  LIMITATIONS  ON  THE  USE  OF 
PARALLEL  MEMORIES 


by 


Henry  David  Shapiro 

B.A., Johns Hopkins University, 1968
M.S., Stanford University, 1969


Department  of  Computer  Science 
University of Illinois at Urbana-Champaign
Urbana,  Illinois 


This  work  was  supported  in  part  by  the  National  Science  Foundation 
under grant no. NSF GJ 41538 and was submitted in partial
fulfillment  for  the  Doctor  of  Philosophy  degree  in  Computer  Science, 
1975. 




ACKNOWLEDGMENTS 

Special  thanks  are  due  to  the  chairman  of  my  doctoral  committee, 
Professor  C.  L.  Liu,  for  his  constant  encouragement  and  guidance  throughout 
this  research.  Appreciation  is  also  extended  to  the  other  members  of  the 
committee,  Professors  David  Kuck,  Duncan  Lawrie,  Judith  Liebman,  and 
Franco  Preparata,  for  their  many  constructive  suggestions.   Professor 
Preparata  deserves  a  special  note  of  gratitude  for  his  comments  which 
helped  to  simplify  the  proof  of  Theorem  5. 

Also,  sincere  appreciation  is  extended  to  two  fellow  graduate 
students,  Bruce  Link,  for  his  sustained  interest  in  this  research  as  well 
as  his  participation  in  discussions  of  these  results;  and  Brian  Hansche, 
for our discussions of the conjectures presented in Chapter 4.

To  Mr.  Stanley  Zundo  of  the  Computer  Science  drafting  department 
goes  a  note  of  thanks  for  his  assistance  in  the  preparation  of  the  numerous 
figures included in this dissertation. Thanks also go to Mrs. Connie Slovak
for  an  outstanding  job  in  the  typing  of  this  manuscript. 

Recognition  for  the  financial  support  which  made  this  research 
possible  is  due  to  both  a  National  Science  Foundation  Graduate  Fellowship 
and NSF Grant GJ 41538.

Finally,  a  note  of  thanks  to  my  wife,  Jacqueline,  who  more  than  anyone 
else,  encouraged  and  cheered  me  when  this  work  progressed  slowly. 


TABLE OF CONTENTS

1. THE DATA ORGANIZATION PROBLEM

1.1 Machine Models and the Data Organization Problem
1.2 Formalization of the Problem
1.3 Elimination of Boundary Conditions
1.4 Classes of Skewing Schemes and Some Particular Generalized Lines
1.5 Summary

2. DETERMINATION OF VALID SKEWING SCHEMES

2.1 Introduction
2.2 The Basic Result
2.3 Existence and Construction of Valid Skewing Schemes when the Number
    of Memory Modules Equals the Length of the Generalized Line
2.4 Existence and Construction of Valid Periodic Skewing Schemes

3. SPECIAL RESULTS ON [x,y]-LINES

3.1 Introduction
3.2 Preliminaries
3.3 The Special Case of a Prime Number of Memory Modules
3.4 Generalization to Composite N
3.5 Further Results and Examples

4. UNRESOLVED PROBLEMS AND DIRECTIONS OF FURTHER RESEARCH

4.1 The Effectiveness of Linear and Periodic Skewing Schemes
4.2 Questions Relating to Memory Utilization
4.3 Comments on Broader Problems

LIST OF REFERENCES

VITA


LIST OF FIGURES

Figure

1  Multi-function Computer
2  Data Needed for the Evaluation of the Function, F
3  Parallel Computer
4  Geometric Realization of a Generalized Line
5  The Instance of the [x,y]-line whose First Component is (i,j)
6  Containment Relations Between Classes of Skewing Schemes
7  Instances of a Generalized Line, with their Designated Elements
   Marked with Asterisks
8  Checking the Condition in Theorem 4
9  Pictorial Presentation of the Proof of Theorem 4
10 Pictorial Presentation of the Proof of Theorem 4
11 Tesselation of the Plane by a Generalized Line
12 The Generalized Line L = ((0,0), (0,1), (0,2), (1,1), (2,1))
   Cannot Tesselate the Plane
13 The Skewing Scheme Resulting From the Use of Theorem 5
14 Tesselation of the Plane by the Generalized Line,
   L = ((0,0), (1,0), (1,1), (2,0), (2,1))
15 Tesselation of the Plane by the Generalized Line,
   L = ((0,0), (1,0), (2,0), (2,1), (2,2))
16 One Possible Tesselation of the Plane by the Generalized Line,
   L = ((0,0), (1,-1), (1,0), (1,1), (2,0))
17 Another Possible Tesselation of the Plane by the Generalized Line,
   L = ((0,0), (1,-1), (1,0), (1,1), (2,0))
18 The "Wrap Around" Interpretation of a Generalized Line Used with
   Periodic Skewing Schemes
19 A Valid Linear Skewing Scheme for the Generalized Line,
   L = ((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2))
20 Proof of the Existence of a Periodic Skewing Scheme for
   L = ((0,0), (0,1), (2,1), (2,2))
21 [x,y]-lines on the Torus
22 Programmer's View of STARAN Memory
23 The Periodic Skewing Scheme Used in the STARAN Computer
24 Positioning Four Instances of the Generalized Line
   ((0,0), (1,0), (1,1), (2,0), (2,1)), so their Designated Elements
   Form a Parallelogram
25 Alternate Positionings of Instances for the Generalized Line
   ((0,0), (1,0), (1,1), (1,2), (2,2))
26 A Non-periodic Skewing Scheme, ψ, for which φ(i,j) =
   ψ(i mod N, j mod N) is not Valid
27 Examples of Translating by (p,q) and/or (r,s), so that all the
   Components of the Instance of the Generalized Line Lie Interior
   to the Parallelogram
28 Components of an Instance of a Generalized Line, after Translation
   by (p,q) and/or (r,s), which are Stored in Same Memory Module
29 An Example of a Polyomino for which There is a Valid Periodic
   Skewing Scheme, but no Valid Linear Skewing Scheme
30 Covers for the Generalized Line ((0,0), (0,1), (0,2), (1,1), (2,1))
   which Tesselate the Plane



1.   THE  DATA  ORGANIZATION  PROBLEM 

1.1  Machine  Models  and  the  Data  Organization  Problem 

In the late 1950's and early 1960's computer architects began
to explore the possibility of increasing the speed at which existing
computers operated, by performing some internal operations simultaneously.
The overlapping of memory fetches with instruction decoding and execution was
the basis of the increase in speed of several machines. A difficulty
imposed by the hardware technology of that day was that the rate at which
the control unit and arithmetic processor could manipulate data was
higher than the rate at which a single memory unit could supply the data.
Despite many changes in memory and circuit technology over the past twenty
years, the inability of a single memory unit to satisfy the data demands
of the central processor has not changed. It appears that this situation
will persist in the foreseeable future. Because of the relatively slow
memory data rate, primary memory on most modern computers consists of
several independent memory modules. Since memory fetches can go on
simultaneously in different memory modules, the rate at which the memory
can supply data to the central processor is effectively increased. A
very successful machine, designed on these general principles, was the
CDC 6600 [15]. Figure 1 depicts a block diagram of a computer, which may
be regarded as an abstraction of this machine. The designers of the
CDC 6600 realized that effective use of the potentially highly overlapped
functioning of their computer depended on reducing data dependencies in
computations and on storing the data requested by the control unit so
that data elements demanded in quick succession were in different memory
modules. The problem of detecting and reducing data dependencies in a
computation has received much serious attention in the literature. The
question of how to organize the data, so that data elements needed in
quick succession by the central processor were not in the same memory
module, was, for a long time, generally ignored. The designers of the
CDC 6600 tried to lessen contention for memory by arranging primary memory
so that references to consecutively numbered memory locations cycled
through all thirty-two memory modules before repeating. This scheme
eliminates memory conflict in the most common cases: fetching sequential
instructions and manipulating the data of one-dimensional arrays stored
in consecutive memory locations.

Figure 1: Multi-function Computer

The  problem  of  organizing  the  data  of  a  two-dimensional  array, 
so  that  data  requested  in  quick  succession  are  in  different  memory 
modules,  was  left  to  the  programmer  and/or  the  compiler.   To  make  this 
problem  more  explicit  consider  the  following  specific  example.  A  FORTRAN 
programmer  writes 

DIMENSION  A(N,  N) 
A(I,J)=F(A(I-1,J-1),A(I-1,J),A(I-1,J+1),A(I,J),A(I+1,J-1),A(I+1,J),A(I+1,J+1)) 

Normally the programmer envisions the memory allocated to array A as actually
being two-dimensional, leaving it to the compiler to convert the doubly
subscripted references to real machine addresses. Fetching the parameters
for the function call can be thought of as fetching, in rapid succession,
the data enclosed by the outlined template in Figure 2. Depending on the
dimensions of the array A, and the method of data organization employed
by the FORTRAN compiler, some of the seven parameters needed by the
function, F, may lie in the same memory module. If such a memory conflict
occurs, the fetching of the data, and the overall computation, will be
slowed. In general, organizing the data of a two-dimensional array so
that memory conflicts are reduced or eliminated is a very difficult
problem.
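
The dependence on the array dimension can be seen in a small Python sketch. It is only an illustration under stated assumptions: the array is stored column-major (as FORTRAN compilers conventionally do), consecutive addresses cycle through the memory modules as described above for the CDC 6600, and indices are 0-based. The function names `module_of`, `template_cells`, and `conflicts` are hypothetical and do not come from the thesis.

```python
def module_of(i, j, n, num_modules):
    """Module holding A(i, j) of an n x n array stored column-major,
    with consecutive addresses cycling through the modules (assumed layout)."""
    address = i + j * n          # column-major offset, 0-based indices
    return address % num_modules


def template_cells(i, j):
    """The seven elements of A referenced by F in the FORTRAN example."""
    return [(i - 1, j - 1), (i - 1, j), (i - 1, j + 1),
            (i, j),
            (i + 1, j - 1), (i + 1, j), (i + 1, j + 1)]


def conflicts(i, j, n, num_modules):
    """Number of fetched elements that fall in an already used module."""
    modules = [module_of(r, c, n, num_modules) for r, c in template_cells(i, j)]
    return len(modules) - len(set(modules))


if __name__ == "__main__":
    for n in (32, 33, 35):
        print(n, conflicts(5, 5, n, 32))
```

Under these assumptions, with thirty-two modules the seven operands collide heavily when the array dimension is 32, collide less when it is 33, and do not collide at all when it is 35, which is the kind of sensitivity the paragraph above describes.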

Another  machine  design  in  which  this  same  type  of  problem 
arises is depicted in Figure 3. This is a single-instruction-multiple-data
stream (SIMD) machine, an abstraction of ILLIAC IV. In many
computations  the  goal  is  to  fetch  M  words  of  data  in  parallel  and  then 
operate  on  them  simultaneously.   If  even  two  of  the  data  words  to  be 
fetched  are  in  the  same  memory,  all  the  processors  may  have  to  sit  idle 
while  a  second  memory  cycle  is  initiated.   This  can  affect  performance 
dramatically.   Because  memory  conflicts  can  seriously  degrade  performance, 
in  machines  of  this  design,  organizing  the  data  of  a  computation  so  that 
memory  conflicts  are  avoided  can  be  very  important. 

The purpose of this thesis is to develop some mathematical
conditions which determine if the data of a two-dimensional array can be
stored in a primary memory consisting of independent memory modules, so
that during a given computation the data requested by the control unit and/or
arithmetic processors can be fetched without memory conflicts. In Chapter 1
preliminaries are considered. Chapter 2 provides a general discussion,
while Chapter 3 focuses attention on some special cases of importance
in practice. Chapter 4 informally presents some techniques which
have promise in practice, even though complete theoretical analysis of
the techniques has not been completed.

Figure 2: Data Needed for the Evaluation of the Function, F.

Figure 3: Parallel Computer.

1.2  Formalization  of  the  Problem 

In  Section  1.1,  the  problem  of  organizing  data  in  parallel 
memories  to  eliminate  memory  conflicts  was  developed  from  an  historical 
perspective.   To  treat  this  problem  mathematically  it  is  convenient  to 
provide  a  model  of  the  computations  that  abstract  the  situation  sufficiently 
so  that  machine  dependent  details  are  eliminated.   The  data  for  the 
computations  are  to  be  stored  in  a  doubly  subscripted  array. 

Definition 1: A generalized line, L, of length n, is
an n-tuple of ordered pairs of integers, the first
ordered pair of which is (0,0).

A generalized line can be thought of as a rigid template, which
during the course of a computation is positioned at various locations over
the matrix of data. The data enclosed by the template is to be fetched
for a computation. Returning to the programming example used in Section 1.1,
the outlined shape of Figure 2 is to be viewed as the template, and its
positioning over the matrix of data elements is determined by the actual
values of I and J during execution. The data enclosed by the shape needs to
be fetched before computation of the function, F, can proceed. The actual
generalized line is an n-tuple, for example L_1 = ((0,0), (0,1), (0,2), (1,1),
(2,0), (2,1), (2,2)). This is clearly just a formal way of specifying a
geometric template. The template can be built by placing unit squares
on the plane, so that the unit squares are centered at the points of the
plane indicated by the ordered pairs of the n-tuple. Figure 4
demonstrates this construction for the generalized line L_1.

There  are  a  few  minor  points  that  need  clarification.   First, 
the  labeling  of  the  points  of  the  plane  is  not  the  method  commonly  used 
in  elementary  algebra.   The  first  coordinate  indicates  the  vertical 
direction,  with  down  being  positive,  and  the  second  coordinate  indicates 
the  horizontal  direction,  with  right  being  positive.   This  labeling  was 
chosen  to  reinforce  the  fact  that  the  data  are  stored  in  a  two-dimensional 
array;  this  method  of  labeling  is  often  used  for  two-dimensional  arrays 
in  the  literature.   This  labeling  scheme  also  conforms  to  that  of  other 
authors [3,10]. A second minor point is that technically L_1 and L_2 = ((0,0),
(-1,-1), (1,-1), (-1,0), (1,0), (-1,1), (1,1)) are different generalized lines.
Their realization by unit squares, however, gives the same geometric shape.
A formal definition of equivalent generalized lines could be given;
intuitively two generalized lines are equivalent if they realize the same
shape, without rotations or reflections. A third point is that the
geometric realization of a generalized line need not be a connected
figure.

As  has  been  pointed  out,  a  generalized  line  can  be  viewed  as  a 
template.   During  the  execution  of  a  program  this  template  will  be  positioned 
over  the  matrix  of  data,  and  the  data  elements  enclosed  by  it  will  be 
referenced  in  parallel  (or  in  quick  succession,  depending  on  the  nature 
of the machine). The positioning of a template can be viewed as the
intuitive interpretation of

Definition 2: An instance of a generalized line, L,
is the ordered n-tuple, L(a,b), resulting from the
addition of the ordered pair (a,b) to each component
of the generalized line.

Figure 4: Geometric Realization of a Generalized Line


The positioning of the template in Figure 2 corresponds to
L_2(i,j). If the equivalent generalized line, L_1, (see Figure 4) is used,
then this same collection of data elements corresponds to L_1(i-1,j-1). It
is reasonable to consider a version of FORTRAN designed for machines with
parallel functioning. The program segment of Section 1.1 might become


TEMPLATE L=((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2))
DIMENSION  A(N,N) 

A(I,J)=F(L(I-1,J-1)  OF  A) 
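
Definitions 1 and 2 translate directly into code. The following Python sketch is an illustration only: a generalized line is represented as a tuple of (row, column) pairs, and the names `instance`, `corner_anchored`, and `center_anchored` are hypothetical. The two tuples are the seven-element template of the FORTRAN example, written once with its upper left element first and once with its center element first.

```python
def instance(line, a, b):
    """Return L(a, b): the ordered pair (a, b) added to each component of L."""
    return tuple((r + a, c + b) for r, c in line)


# The template of the FORTRAN example, anchored at its upper left element.
corner_anchored = ((0, 0), (0, 1), (0, 2), (1, 1), (2, 0), (2, 1), (2, 2))

# The same geometric shape, anchored at its center element.
center_anchored = ((0, 0), (-1, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (1, 1))


def same_elements(first, second):
    """True if two instances reference the same set of matrix elements."""
    return set(first) == set(second)


if __name__ == "__main__":
    i, j = 4, 7
    print(same_elements(instance(corner_anchored, i - 1, j - 1),
                        instance(center_anchored, i, j)))
```

The check at the end reflects the remark in the text that the two equivalent generalized lines cover the same data elements once the offset (a,b) is adjusted by (1,1).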

With  these  definitions  it  is  now  possible  to  give  a  precise 
statement  of  the  data  organization  problem. 

Problem:   Given  a  large  matrix  of  data  and  given  that 
an  algorithm  requires,  at  various  stages  in  its 
computation,  the  data  elements  contained  in  many 
instances  of  one  or  more  generalized  lines,  is  it 
possible  to  assign  the  data  elements  of  the  matrix 
to  various  memory  modules,  so  that  when  the  data 
elements  of  an  instance  of  a  generalized  line  are 
demanded,  all  the  data  elements  lie  in  different 
memory  modules? 

The ability to store the data, so that when an instance of a
generalized line is desired all the data elements are in different
memories, is a goal in keeping with the functioning of machines
designed along the lines of Figures 1 and 3. If it is possible to
store the data so no memory conflicts result when fetching instances
of the generalized lines used by the algorithm, then, in machines
designed along the lines of Figure 1, contention for memory is lessened,
and in machines designed along the lines of Figure 3, the data can be
fetched in one memory cycle. This problem motivates the following
definitions.

Definition 3: Given an M × M matrix and N independent
memory modules, a skewing scheme is a mapping,
φ: {0,1,2,...,M-1} × {0,1,2,...,M-1} → {0,1,2,...,N-1},
where φ(i,j) = k means matrix element a_ij is stored
in memory module k.

Definition 4: Given a collection of generalized lines,
{L_1, L_2, ..., L_p}, an M × M matrix of data, N memory
modules, and a skewing scheme, φ, the skewing scheme
is said to be valid for this collection if and only if
given any instance of any of the generalized lines, φ
assigns the data elements of the instance which lie
within the matrix bounds to distinct memory modules.

With  these  definitions  the  problem  described  earlier  can  be 
formalized  as:   Given  an  MxM  matrix,  N  independent  memory  modules,  and  a 
collection  of  generalized  lines  used  by  an  algorithm,  is  there  a  valid 
skewing  scheme  for  this  collection? 
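
For a finite matrix, validity can be checked by brute force. The sketch below is a minimal illustration, assuming a skewing scheme is given as a Python function from (i, j) to a module number and a generalized line as a tuple of (row, column) pairs; the name `is_valid` is hypothetical. As the definition of validity requires, only the elements of an instance that lie inside the matrix bounds are compared.

```python
def is_valid(scheme, lines, m):
    """Brute-force validity check of `scheme` for `lines` on an m x m matrix."""
    for line in lines:
        rows = [r for r, _ in line]
        cols = [c for _, c in line]
        # Every offset (a, b) for which the instance can touch the matrix.
        for a in range(-max(rows), m - min(rows)):
            for b in range(-max(cols), m - min(cols)):
                inside = [(r + a, c + b) for r, c in line
                          if 0 <= r + a < m and 0 <= c + b < m]
                modules = [scheme(i, j) for i, j in inside]
                if len(modules) != len(set(modules)):
                    return False     # two in-bounds elements share a module
    return True


if __name__ == "__main__":
    row = ((0, 0), (0, 1), (0, 2))
    column = ((0, 0), (1, 0), (2, 0))
    print(is_valid(lambda i, j: j % 3, [row], 6))            # rows only
    print(is_valid(lambda i, j: j % 3, [row, column], 6))    # columns collide
    print(is_valid(lambda i, j: (i + j) % 3, [row, column], 6))
```

The search is exhaustive and therefore only practical for small M; it is meant to make the definition concrete, not to suggest how valid skewing schemes are found.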

1.3 Elimination of Boundary Conditions

The  reader  may  have  noticed  in  the  preceding  section  that  the 
definition  of  a  skewing  scheme  explicitly  depends  on  M,  the  size  of 
the  matrix.   From  the  point  of  view  of  the  programmer  this  is  unfortunate. 
The matrix size may vary from one run to the next. If a new skewing
scheme were needed for each run, use of some programs might be difficult.
Notice, however, that if φ is a valid skewing scheme for a collection of
generalized lines on an M × M matrix, then φ restricted to {0,1,2,...,M'-1} ×
{0,1,2,...,M'-1}, M' < M, is also valid for this collection on an M' × M'
matrix. The pragmatic consideration that M may not be known in advance,
and can be large, leads to the search for valid skewing schemes with
domain {0,1,2,...} × {0,1,2,...}. When skewing schemes with this domain
are  used  several  benefits  accrue.  As  noted  above,  such  a  skewing  scheme 
can  be  used  without  prior  knowledge  about  the  size  of  M.  A  secondary 
benefit  is  that  special  case  handling  of  instances  which  overlap  the 
boundaries  along  the  right  and  bottom  of  the  MxM  matrix  can  be  simplified. 
Zero  or  some  other  null  value  can  be  stored  for  the  value  of  data  elements 
outside  the  actual  array  bounds.   These  practical  considerations  justify 
elimination  of  the  boundaries  along  the  right  and  bottom  edges  of  the 
matrix,  i.e.  there  are  reasons  to  treat  the  matrix  as  infinite  in  size. 
It  is  also  possible  to  show  mathematically  that  widening  the  domain  of 
skewing  schemes  to  the  quarter  plane  does  not  result  in  any  loss  of  generality, 
that  is,  those  collections  for  which  valid  skewing  schemes  exist  on  all 
finite  domains,  have  valid  skewing  schemes  on  the  quarter  plane. 

Theorem 1: Given a collection of generalized lines,
{L_1, L_2, ..., L_p}, and N, the number of memory modules,
if for each M there exists a valid skewing scheme
for this collection, φ_M: {0,1,2,...,M-1} ×
{0,1,2,...,M-1} → {0,1,2,...,N-1}, then there
exists a valid skewing scheme for this collection
with domain {0,1,2,...} × {0,1,2,...}.

Proof: The proof uses the König infinity lemma: If a rooted
tree has infinitely many nodes, but each node has finitely many successors,
then there is a path of infinite length in the tree [9]. We construct a
rooted tree as follows. The nodes at level i in the tree are the skewing
schemes valid for this collection for an i × i matrix. Recall that for
instances which overlap the boundaries of the matrix there must not be
any memory conflicts for elements of the instance lying inside the matrix
bounds. Also note that the one node at level 0, the root, is an artificial
construct, since matrices of dimension zero have no data elements, i.e.
one node at level 0 is created, by convention, to provide a root for the
tree. A node at level i, φ_i, is connected to a node at level i+1, φ_{i+1},
if φ_{i+1} restricted to {0,1,2,...,i-1} × {0,1,2,...,i-1} is just φ_i. (The node
at level 0 is connected to all nodes at level 1, by convention.) This
construction produces a tree. To see this, note that every node at level i,
i > 0, has a predecessor, for if φ_i is a valid skewing scheme on the i × i
matrix, then φ_i restricted to {0,1,2,...,i-2} × {0,1,2,...,i-2} is a valid
skewing scheme on the (i-1) × (i-1) matrix. Also note that each φ_i has
only one predecessor. Thus the construction yields a tree. Next, notice
that the tree has infinitely many nodes, since by assumption there is a
valid skewing scheme for every M, and, hence, at least one node at each
level. Lastly, each node has only finitely many successors; in fact a
node at level i has exactly N^(2i+1) possible candidates for successor
nodes (one choice of module for each of the 2i+1 elements added in passing
from the i × i to the (i+1) × (i+1) matrix), many of which will presumably
fail to be valid skewing schemes.
Thus the König infinity lemma implies an infinite path in the tree.
Let the nodes in this path be ψ_0, ψ_1, ψ_2, ... . Define φ by φ(i,j) = ψ_k(i,j)
where k > max(i,j). Notice that φ is a well-defined function, since if
k_2 > k_1 then ψ_{k_2} restricted to {0,1,2,...,k_1-1} × {0,1,2,...,k_1-1} is just
ψ_{k_1}, by the way in which the tree was constructed. To see that φ is
valid for the collection {L_1, L_2, ..., L_p}, consider an arbitrary instance of
one of the generalized lines. Let k be selected sufficiently large so
that this instance does not overlap the right or bottom boundaries of
the k × k matrix. Now since ψ_k is valid for this collection on the k × k
matrix, and φ restricted to {0,1,2,...,k-1} × {0,1,2,...,k-1} equals ψ_k,
all the data elements comprising the instance (except those which overlap
the left and top boundaries) will be mapped to different memories by ψ_k,
and hence φ. Since the instance was arbitrary, φ is a valid skewing
scheme. ■
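
The tree used in the proof can be built explicitly for a small case. The following Python sketch is only an illustration of the construction, not of the infinity lemma itself: it assumes a single generalized line and represents a skewing scheme on an i × i matrix as a tuple of rows. The function names and the example at the end (two horizontally adjacent elements, two memory modules) are hypothetical choices made here for illustration.

```python
from itertools import product


def valid_on(grid, line):
    """True if no instance of `line` puts two in-bounds elements in one module."""
    size = len(grid)
    rows = [r for r, _ in line]
    cols = [c for _, c in line]
    for a in range(-max(rows), size - min(rows)):
        for b in range(-max(cols), size - min(cols)):
            seen = set()
            for r, c in line:
                i, j = r + a, c + b
                if 0 <= i < size and 0 <= j < size:
                    if grid[i][j] in seen:
                        return False
                    seen.add(grid[i][j])
    return True


def successors(grid, line, n_modules):
    """Valid (i+1) x (i+1) schemes whose restriction to i x i equals `grid`."""
    i = len(grid)
    result = []
    # 2i+1 new elements: one at the end of each old row, plus a new bottom row.
    for values in product(range(n_modules), repeat=2 * i + 1):
        new = [list(grid[r]) + [values[r]] for r in range(i)]
        new.append(list(values[i:]))
        if valid_on(new, line):
            result.append(tuple(tuple(row) for row in new))
    return result


def level_sizes(line, n_modules, depth):
    """Number of tree nodes at levels 0..depth (level 0 is the artificial root)."""
    level = [()]
    sizes = [1]
    for _ in range(depth):
        level = [child for node in level
                 for child in successors(node, line, n_modules)]
        sizes.append(len(level))
    return sizes


if __name__ == "__main__":
    print(level_sizes(((0, 0), (0, 1)), 2, 3))
```

Each call to `successors` enumerates all N^(2i+1) candidate extensions and keeps the valid ones, so the sketch is usable only for very small i and N.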

Theorem 1 shows that in searching for a valid skewing scheme,
M can be ignored. As pointed out earlier, an additional benefit is that
special handling at some matrix boundaries is eliminated. To extend this
simplification to the left and top boundaries it is convenient to use
{...,-1,0,1,...} × {...,-1,0,1,...} as the domain for skewing schemes.
Unlike our situation earlier, this change cannot be justified on the
practical grounds that the size of the matrix may not be known in advance.
However, any difficulties at the left and top boundaries can be eliminated
by

Theorem 2: Given a collection of generalized lines,
{L_1, L_2, ..., L_p}, and N, the number of memory modules,
if there exists a skewing scheme, φ, valid for this
collection with domain {0,1,2,...} × {0,1,2,...}, then
there is a skewing scheme valid for this collection
with domain {...,-1,0,1,...} × {...,-1,0,1,...}.

Proof: The proof is similar to the preceding proof, so only a
few details are sketched. The main difference is that nodes at level i
are chosen to represent skewing schemes valid for the collection of
generalized lines with domain {-i,-i+1,...,i-1,i} × {-i,-i+1,...,i-1,i}.
The only new difficulty is to see that level i is not empty. This is so
since φ restricted to {0,1,2,...,2i} × {0,1,2,...,2i} is valid, so ψ_i
defined on {-i,-i+1,...,i-1,i} × {-i,-i+1,...,i-1,i} by ψ_i(j,k) = φ(j+i,k+i)
is also valid. ■

The  content  of  Theorem  2  is  that  there  is  no  loss  of  generality 
in  considering  only  matrices  of  data  that  are  infinite  in  all  directions. 
This  is  particularly  useful  in  formulating  theoretical  results,  since 
proofs  no  longer  need  to  account  for  any  special  conditions  that  might  arise 
when only part of an instance lies inside the matrix bounds. Because of
Theorem  2  only  skewing  schemes  whose  domain  is  the  entire  plane  will  be 
considered  throughout  the  rest  of  this  thesis. 

1.4 Classes of Skewing Schemes and Some Particular Generalized Lines

Given a collection of generalized lines it is desirable to have
some conditions which determine whether or not a valid skewing scheme exists.
In situations that arise in actual practice the existence of such a valid
skewing scheme is usually not sufficient. It is also highly desirable that
φ(i,j) be readily calculable, so that address computation does not overly
degrade system performance. There are two approaches that can be taken.
One is that φ be represented by a simple mathematical formula. The
second is to use some table look-up strategy.

Neither of these two methods of calculating φ works well for
arbitrary skewing schemes. In general, a closed form mathematical
expression for an arbitrary skewing scheme, φ, may not exist. Table
look-up  techniques  will  also  not  be  of  much  help,  since  a  large 
(theoretically  infinite)  table  will  have  to  be  stored,  and  storing  the 
table  in  memory  so  that  memory  conflicts  are  eliminated  in  obtaining 
information  from  the  table  is  the  same  problem  as  storing  the  original 
matrix  of  data  so  memory  conflicts  are  eliminated.   Because  arbitrary 
skewing  schemes  cannot  always  be  implemented  conveniently,  certain 
subclasses  of  the  skewing  schemes  valid  for  the  entire  plane  take  on 
significance.   Table  look-up  schemes  motivate  the  following  definition. 

Definition 5: Given N, the number of memory modules,
a skewing scheme, φ, is called periodic if and only
if φ(i,j) = φ(i+kN, j+lN), for k, l = ...,-2,-1,0,1,2,...
and for any i and j.

If φ is a periodic skewing scheme, then φ(i,j) = φ(i mod N, j mod N).
Therefore to calculate the value of φ at any point in the plane it is only
necessary to know φ on {0,1,2,...,N-1} × {0,1,2,...,N-1}. If N is sufficiently
small, periodic skewing schemes can be implemented by table look-up at
reasonable cost. The needed values of φ can be stored in a specially
designed super-fast memory.
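
Table look-up for a periodic skewing scheme can be sketched in a few lines of Python. This is a minimal illustration, assuming the basic N × N storage map is given as a list of rows; the function name `make_periodic` and the 3 × 3 example table are hypothetical and are not taken from the thesis.

```python
def make_periodic(table):
    """Build a periodic skewing scheme from its values on the basic
    N x N block; table[i][j] is the module for element (i, j)."""
    n = len(table)

    def phi(i, j):
        # Python's % yields a result in 0..n-1 even for negative i or j,
        # so the same look-up covers the plane infinite in all directions.
        return table[i % n][j % n]

    return phi


if __name__ == "__main__":
    example_table = [[0, 1, 2],
                     [2, 0, 1],
                     [1, 2, 0]]          # hypothetical 3 x 3 storage map
    phi = make_periodic(example_table)
    print(phi(4, 5), phi(-1, -1))
```

Only the N × N table has to be stored, which is the cost the text weighs against computing the scheme from a formula.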

As N becomes large, and especially in machines designed along
the lines suggested by Figure 3, where each arithmetic unit may require a
private copy of the basic N × N storage map, periodic skewing schemes
realizable only by table look-up become unattractive. In such situations
the first method suggested for computing φ, a simple mathematical
formula, appears more reasonable. One class of skewing schemes that
has attracted attention in the literature is the class of linear skewing
schemes.

Definition 6: Given N, the number of memory modules,
a skewing scheme, φ, is called linear if and only if
there exist constants a and b such that φ(i,j) = (ai + bj)
mod N.

The class of linear skewing schemes is a subclass of the periodic
skewing schemes, since if φ is a linear skewing scheme, then φ(i+kN, j+lN) =
a(i+kN)+b(j+lN) mod N = ai+bj mod N = φ(i,j), and, thus, φ is a periodic
skewing scheme. That the linear skewing schemes are a subclass of the
periodic skewing schemes shows that some periodic skewing schemes can be
implemented without table look-up. There are other periodic skewing schemes
which can also be efficiently implemented without table look-up. The one
used in the STARAN computer will be mentioned in Chapter 3.

Budnik  and  Kuck  [3]  and  Lawrie  [10]  have  investigated  linear 
skewing  schemes  in  detail.   Much  of  their  work  was  motivated  by  considering 
machines  designed  as  in  Figure  3.  After  investigating  the  data  requirements 
of  programs  written  for  similar  machines,  and  after  discussions  with 
numerical  analysts,  they  generally  focused  their  attention  on  some 
commonly used generalized lines. In particular the generalized lines
consisting of N consecutive elements of a matrix row (the generalized line
R = ((0,0), (0,1), (0,2), ..., (0,N-1))), N consecutive elements of a matrix
column (the generalized line C = ((0,0), (1,0), (2,0), ..., (N-1,0))), N
consecutive elements of a forward diagonal (the generalized line
D = ((0,0), (1,1), (2,2), ..., (N-1,N-1))), N consecutive elements of a
backwards diagonal (the generalized line B = ((0,0), (1,-1), (2,-2), ...,
(N-1,-N+1))), and, when N was a perfect square, √N x √N blocks (the
generalized line S = ((0,0), (0,1), ..., (0,√N-1), (1,0), (1,1), ..., (1,√N-1),
..., (√N-1,0), (√N-1,1), ..., (√N-1,√N-1))) were of primary concern. One
of the main results of Budnik and Kuck [3] is that if 2|N or 3|N, where
N is the number of memories, then there is no valid linear skewing
scheme for the collection of generalized lines, {R, C, D, B}. In order to
generalize  this  result  and  to  provide  a  reasonable  notation,  a  definition 
is  useful. 

Definition 7: An [x,y]_N-line is a generalized line
of the form ((0,0), (y,x), (2y,2x), ..., ((N-1)y,(N-1)x)).

Pictorially, the template for an [x,y]_N-line is formed by starting
at the origin and going over x and down y, until a total of N points are
generated (see Figure 5). In this notation the generalized line
representing N consecutive elements of a row is the [1,0]_N-line, the
generalized line representing N consecutive elements of a column is the
[0,1]_N-line, etc. The following result can be found in Budnik and Kuck [3],
though different notation is used.

Theorem 3: Given N memory modules and a collection of
[x,y]_N-lines, {[x_i,y_i]_N-lines | i=1,2,...,I},
φ(c,d) = ac + bd mod N is a valid linear skewing
scheme for this collection if and only if
(ay_i+bx_i, N) = 1, for i=1,2,...,I.

Note that N is the length of the [x,y]_N-line, and that
(c,d) denotes the greatest common divisor of c and d.

Figure 5: The Instance of the [x,y]_N-line whose
First Component is (i,j)

Proof: Suppose first that φ(i,j) = ai+bj mod N is a valid
skewing scheme for this collection. Take an arbitrary [x,y]_N-line from
the collection, say L = [x_r,y_r]_N-line. Since φ is valid the N elements
in the instance L(0,0) must be mapped by φ to different memory modules.
Since this instance is just ((0,0), (y_r,x_r), ..., ((N-1)y_r,(N-1)x_r)), it
must be the case that φ(0,0) ≠ φ(vy_r, vx_r), for v=1,2,...,N-1. Thus
a·0+b·0 mod N ≠ avy_r+bvx_r mod N, for v=1,2,...,N-1. But this is just
v(ay_r+bx_r) ≢ 0 (mod N), for v=1,2,...,N-1, and it is a well-known result of
elementary number theory that this implies (ay_r+bx_r, N) = 1 [7].

Conversely suppose there exist a and b such that (ay_i+bx_i, N) = 1,
for i=1,2,...,I. To show that φ(i,j) = ai+bj mod N is valid for the
collection consider an arbitrary instance of an arbitrary [x,y]_N-line in
the collection. It suffices to show that the N elements in this instance
are mapped to different memory modules. For definiteness, consider
L = [x_r,y_r]_N-line and the instance L(i,j). Since the choice of line and
the choice of instance were made arbitrarily, if φ maps all the components
of ((i,j), (i+y_r, j+x_r), ..., (i+(N-1)y_r, j+(N-1)x_r)) to different memory
modules, then φ will be valid. But if φ(i+vy_r, j+vx_r) = φ(i+v'y_r, j+v'x_r),
for some v,v' ∈ {0,1,2,...,N-1}, with v ≠ v', then ai+avy_r+bj+bvx_r ≡
ai+av'y_r+bj+bv'x_r (mod N), which implies (v-v')(ay_r+bx_r) ≡ 0 (mod N), and
since v-v' ≢ 0 (mod N), this contradicts the assumption that
(ay_i+bx_i, N) = 1 for i=1,2,...,I. ■
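
The equivalence in Theorem 3 can be checked numerically for small N. The
sketch below is only an illustration: the function names are hypothetical,
and the enumeration relies on the fact that a linear scheme has period N,
so instances starting in one N x N block cover every case.

```python
from math import gcd

def xy_line(x, y, n):
    """The [x,y]_N-line of Definition 7: ((0,0), (y,x), ..., ((N-1)y,(N-1)x))."""
    return [(v * y, v * x) for v in range(n)]

def valid_by_theorem3(a, b, lines, n):
    """Theorem 3 criterion: gcd(a*y + b*x, N) = 1 for every [x,y]_N-line."""
    return all(gcd(a * y + b * x, n) == 1 for (x, y) in lines)

def valid_by_enumeration(a, b, lines, n):
    """Directly test that every instance hits N distinct memory modules."""
    phi = lambda c, d: (a * c + b * d) % n
    for (x, y) in lines:
        template = xy_line(x, y, n)
        for i in range(n):
            for j in range(n):
                modules = {phi(i + c, j + d) for (c, d) in template}
                if len(modules) != n:
                    return False
    return True

# Rows, columns, forward and backward diagonals in [x,y] notation.
RCDB = [(1, 0), (0, 1), (1, 1), (1, -1)]
```

For example, `valid_by_theorem3(1, 2, RCDB, 5)` and
`valid_by_enumeration(1, 2, RCDB, 5)` agree that φ(i,j) = i+2j mod 5 is
valid for all four lines.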

The result of Budnik and Kuck mentioned earlier follows
immediately, since the collection of generalized lines they refer to is
just {[1,0]_N-line, [0,1]_N-line, [1,1]_N-line, [1,-1]_N-line}, and one of a, b, and
a+b will be divisible by 2, so if 2|N no choices of a and b satisfy the
conditions of the theorem. Similarly one of a, b, a+b, and -a+b will be
divisible by 3.
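
A brute-force search over all pairs (a,b) illustrates this consequence.
The sketch uses the gcd conditions of Theorem 3 for the four lines above;
the function name is hypothetical. The search also suggests the converse
for linear schemes, which is not stated in the text: when N is divisible
by neither 2 nor 3, the choice a = 1, b = 2 gives the values 1, 2, 3, 1,
all relatively prime to N.

```python
from math import gcd

def has_valid_linear_scheme(n):
    """True if some phi(i,j) = a*i + b*j mod n is valid for rows, columns,
    forward diagonals and backward diagonals of length n (Theorem 3)."""
    return any(
        all(gcd(t, n) == 1 for t in (a, b, a + b, b - a))
        for a in range(n)
        for b in range(n)
    )

# Values of n from 2 to 30 for which the search succeeds.
good = [n for n in range(2, 31) if has_valid_linear_scheme(n)]
```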

Lawrie [10] points out that in computers designed along the
lines of Figure 3 solving the data organization problem is insufficient
in practice. To be able to use such a computer in a reasonable way,
an efficient memory-processor connection network, which can route the
data to the appropriate arithmetic unit, is also needed. In providing
a data organization scheme and a connection network, Lawrie uses P = 2^(2q)
processors and N = 2P memories. Within this framework he found that by
use of linear skewing he could fetch any P consecutive elements of any
row, column, forward diagonal, backward diagonal, or any √P x √P block
without memory conflicts. (Take a = √P + 1 and b = 2.) This does not
violate Theorem 3, since the number of memories, N, is not equal to the
length of the lines, P. In addition to providing a skewing scheme,
Lawrie also designed a network, the Ω-network, which routed the data
to the appropriate processor in O(log P) time. F. Yao [16] has shown
that the Ω-network is optimal. Lawrie left unanswered the question of
using some non-linear skewing scheme to achieve the same conflict free
access while at the same time reducing the number of memories, N, to be
equal to the number of processors, P = 2^(2q). The restriction that the
number of processors be a power of two was kept so that any arithmetic
mod N could be performed rapidly by shifting and by the hope that the
Ω-network, or some slight modification of it, would still be able to
align the data. In Chapter 3 of this thesis it will be shown that
if the number of memories equals the number of processors, which in turn
is a power of two, then no skewing scheme of any type will be valid for
the collection of generalized lines Lawrie considers.
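
The choice a = √P + 1, b = 2 with N = 2P can be checked directly for small
P. The sketch below is an illustration under stated assumptions: the
function name and template encoding are hypothetical, and only the
instance at the origin is tested, which suffices for a linear scheme
because translating an instance adds the same constant (mod N) to every
module number.

```python
from math import isqrt

def conflict_free(p, n, a, b):
    """Check phi(i,j) = a*i + b*j mod n on P-element rows, columns,
    forward/backward diagonals and sqrt(P) x sqrt(P) blocks."""
    s = isqrt(p)  # p is assumed to be a perfect square
    phi = lambda i, j: (a * i + b * j) % n
    templates = [
        [(0, v) for v in range(p)],                    # row
        [(v, 0) for v in range(p)],                    # column
        [(v, v) for v in range(p)],                    # forward diagonal
        [(v, -v) for v in range(p)],                   # backward diagonal
        [(r, c) for r in range(s) for c in range(s)],  # square block
    ]
    return all(len({phi(i, j) for (i, j) in t}) == p for t in templates)

def lawrie_parameters(q):
    """P = 2^(2q) processors, N = 2P memories, a = sqrt(P) + 1, b = 2."""
    p = 2 ** (2 * q)
    return p, 2 * p, isqrt(p) + 1, 2
```

With q = 1 this gives P = 4, N = 8 and φ(i,j) = 3i+2j mod 8. Setting N = P
instead and searching all linear (a,b) finds no conflict-free choice,
consistent with Theorem 3.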


Swanson [14] has also studied the problem of designing
efficient memory-processor connection networks. In his construction
the number of memories, N, equals the number of processors, P, and P
is prime. For this case he has designed a network based on k-apart
shifters which operates in O(√P) time, but uses very little hardware.
Unfortunately, to align the data with the processors his network must
be followed by a shift network, which requires an additional O(log P) time
and O(log P) hardware. The choice of P as a prime (greater than 3), however,
guarantees that any instance of the [1,0]_P-, [0,1]_P-, [1,1]_P-, and
[1,-1]_P-lines can be fetched without conflict. The question of designing
memory-processor connection networks will be discussed again briefly in
Chapter 4.

1 . 5  Summary 

In  this  chapter  several  computer  designs  which  utilize  parallel 
memory  modules  to  achieve  program  speed-ups  were  considered.   The  problem 
of  organizing  the  data  so  that  computations  can  proceed  efficiently  was 
formalized. Generalized lines, and as special cases, [x,y]_N-lines, were
defined. Various classes of skewing schemes were also defined. Figure 6
depicts the containment relationships between these classes. Theorem 2
shows that even though skewing schemes valid on the entire plane are a
subclass of those valid on the quarter plane, the two classes of skewing
schemes are of equal power, in that the collections of generalized lines
they can handle are exactly the same. In Chapters 3 and 4 similar
results will be presented for linear and periodic skewing schemes, but
only for certain subclasses of generalized lines.

[Nested regions, from outermost to innermost:
  Schemes Valid in Quarter Plane:
    φ: {0,1,2,...} x {0,1,2,...} -> {0,1,2,...,N-1}
  Schemes Valid in Whole Plane:
    φ: {...,-1,0,1,...} x {...,-1,0,1,...} -> {0,1,2,...,N-1}
  Valid Periodic Schemes:
    φ(i,j) = φ(i+kN, j+lN), k,l ∈ {...,-1,0,1,...}
  Valid Linear Schemes:
    φ(i,j) = ai+bj mod N]

Figure 6: Containment Relations Between Classes
of Skewing Schemes.


2. DETERMINATION OF VALID SKEWING SCHEMES

2.1  Introduction 

Chapter  1  dealt  primarily  with  definitions  and  historical 
perspectives.   In  this  chapter  results  will  be  presented  on  whether  a 
matrix  of  data  can  be  stored  in  N  memory  modules  so  that  instances  of  a 
given  generalized  line,  or  collection  of  generalized  lines,  can  be 
fetched  without  memory  conflicts.   In  some  of  the  results  the  length  of 
the  generalized  line(s)  will  be  restricted  to  equal  the  number  of  memory 
modules. This restriction is motivated by the desire to maximally
utilize memory in computers designed as indicated in Figure 3. If
such a computer has M arithmetic units, then the data elements of
instances of generalized lines of up to length M can be processed in
parallel. If the number of memories is N, then memory utilization is
limited to M/N, because only M out of the N memory modules will be
referenced in one memory cycle. If 100% memory utilization is desired,
N = M is required. Since the memory-processor connection network will be
more complex for larger N, requiring N = M has additional benefits.

In the last part of the chapter special attention is paid to
periodic skewing schemes. The results developed in Sections 2.2 and 2.3
will be seen to carry over to the periodic case without significant change.

2.2  The  Basic  Result 

In  this  section  a  necessary  and  sufficient  condition  is  presented 
for  determining  the  validity  of  a  skewing  scheme  for  a  generalized  line  of 
length M, using N memory modules. To do this it is convenient to
augment some earlier definitions.

Definition 8: The designated element of a generalized
line or an instance of a generalized line is the first
ordered pair of the n-tuple.

In  the  material  to  follow,  the  intuitive,  geometric  viewpoint 
is  a  valuable  aid  to  understanding,  and,  for  this  reason,  the  intuitive, 
as  well  as  the  formal,  approach  will  be  presented.  As  discussed  in 
Chapter  1,  a  generalized  line  can  be  viewed  as  a  rigid  template.   The 
designated  element  can  then  be  indicated  by  marking  the  appropriate 
square of the generalized line or an instance of the generalized line by
a distinguishing mark, like an asterisk. For the generalized line given by
L = ((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2)), this is illustrated
in Figure 7.

Theorem 4: Given a generalized line of length M, N
memory modules, and a skewing scheme, the skewing
scheme is valid for this generalized line if and
only if the following condition holds for every
k ∈ {0,1,2,...,N-1}: When all instances of the
generalized line are considered, in which the
designated element of the instance is mapped by the
skewing scheme into memory k, no two of these
instances have an element in common.

Figure 7: Instances of a Generalized Line, with
their Designated Elements Marked with
Asterisks.
[L = ((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2)).]

Before presenting the formal proof, an intuitive explanation of
the condition presented above might be useful. Imagine a two-dimensional
memory storage map laid out on the plane, in which the entry at (i,j) is
the value of the skewing scheme at this point. Also assume an infinite
supply of realizations of the generalized line by unit squares, with
the designated element marked with an asterisk. Select a k ∈ {0,1,2,...,N-1}.
For each location in the memory storage map containing a k, place a copy
of the generalized line on the plane so that the asterisk is directly over
the k. If, after completing this construction, no two instances of the
generalized line laid down on the plane overlap, and this is true
irrespective of the choice of k, then the skewing scheme is valid.
Conversely, if the skewing scheme is valid, then when the construction
described above is performed for any k, the instances will not overlap.
This construction is illustrated in Figure 8.
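
One way to read this construction computationally: two instances share an
element exactly when their designated elements differ by the offset
between two distinct components of the generalized line. The sketch below
compares that test with a direct check of every instance. It is an
illustration only; it assumes a periodic scheme (so one period of starting
points covers the whole plane), and the function names are hypothetical.

```python
from itertools import product

def valid_direct(phi, line, period):
    """Every instance must map its components to distinct modules."""
    for a, b in product(range(period), repeat=2):
        modules = [phi(a + x, b + y) for (x, y) in line]
        if len(set(modules)) != len(modules):
            return False
    return True

def valid_by_overlap(phi, line, period):
    """Condition of Theorem 4: no two cells in the same module may differ
    by the offset between two distinct components of the line, since two
    instances anchored at such cells would overlap."""
    offsets = {(xj - xi, yj - yi)
               for (xi, yi) in line for (xj, yj) in line
               if (xi, yi) != (xj, yj)}
    for i, j in product(range(period), repeat=2):
        for di, dj in offsets:
            if phi(i, j) == phi(i + di, j + dj):
                return False
    return True

# The five-element line used in Figure 8, with N = 7 memory modules.
T_LINE = ((0, 0), (0, 1), (0, 2), (1, 1), (2, 1))
good_phi = lambda i, j: (i + 3 * j) % 7  # example scheme, valid for T_LINE
bad_phi = lambda i, j: (i + j) % 7       # example scheme, not valid
```

Both functions accept `good_phi` and reject `bad_phi`; the two example
schemes are assumptions chosen for the illustration and are not the scheme
drawn in Figure 8.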

Figure 8: Checking the Condition in Theorem 4.
[L = ((0,0), (0,1), (0,2), (1,1), (2,1)). N = 7.
The condition in Theorem 4 is tested for
k = 3. Overlap occurs in two places, implying
the skewing scheme is not valid. Only a
small section of the plane is shown.]

Proof of Theorem 4: Suppose first that for some k ∈ {0,1,2,...,N-1}
there are two distinct instances of the generalized line, whose designated
elements are mapped by the skewing scheme, φ, into memory module k, and that
these two instances have an element in common. Being specific, let the
generalized line be L = ((x_1,y_1), (x_2,y_2), ..., (x_M,y_M)) (from the
definition of generalized line, (x_1,y_1) = (0,0)) and let the two
instances be L(a_1,b_1) = ((a_1+x_1,b_1+y_1), (a_1+x_2,b_1+y_2), ...,
(a_1+x_M,b_1+y_M)) and L(a_2,b_2) = ((a_2+x_1,b_2+y_1), (a_2+x_2,b_2+y_2),
..., (a_2+x_M,b_2+y_M)). Then, the two instances have an element in common
means (a_1+x_i,b_1+y_i) = (a_2+x_j,b_2+y_j), for some i and j. In addition,
i ≠ j, for i = j implies (a_1,b_1) = (a_2,b_2), contrary to the assumption
that the instances are distinct. Since the designated elements of these two
instances are mapped by φ into memory k,
φ(a_1+x_1,b_1+y_1) = φ(a_2+x_1,b_2+y_1) = k.

In order to see that φ is not a valid skewing scheme for L,
consider the instance L(a_1+x_1-x_j, b_1+y_1-y_j) =
((a_1+x_1-x_j+x_1, b_1+y_1-y_j+y_1), (a_1+x_1-x_j+x_2, b_1+y_1-y_j+y_2), ...,
(a_1+x_1-x_j+x_M, b_1+y_1-y_j+y_M)). The jth component of
L(a_1+x_1-x_j, b_1+y_1-y_j) is (a_1+x_1-x_j+x_j, b_1+y_1-y_j+y_j) =
(a_1+x_1, b_1+y_1), which is mapped by φ into memory k. The ith component
of L(a_1+x_1-x_j, b_1+y_1-y_j) is (a_1+x_1-x_j+x_i, b_1+y_1-y_j+y_i) =
(a_2+x_j+x_1-x_j, b_2+y_j+y_1-y_j) = (a_2+x_1, b_2+y_1), where
(a_1+x_i,b_1+y_i) = (a_2+x_j,b_2+y_j) was used. Since φ(a_2+x_1,b_2+y_1) = k,
the ith component of L(a_1+x_1-x_j, b_1+y_1-y_j) is also mapped into
memory k. Because two distinct components of this instance are mapped by φ
into memory k, φ is not valid for L. Figure 9 provides an intuitive picture
for this part of the proof.

Conversely, suppose φ is not a valid skewing scheme for the
generalized line, L. Then there is an instance of the generalized line,
L(a,b) = ((a+x_1,b+y_1), (a+x_2,b+y_2), ..., (a+x_M,b+y_M)), the ith and jth
components of which, i ≠ j, are mapped by φ into memory k. Then the two
instances L(a+x_j-x_1, b+y_j-y_1) and L(a+x_i-x_1, b+y_i-y_1) violate the
condition of the theorem. Their designated elements are
(a+x_j-x_1+x_1, b+y_j-y_1+y_1) = (a+x_j, b+y_j) and
(a+x_i-x_1+x_1, b+y_i-y_1+y_1) = (a+x_i, b+y_i), respectively, which are the
jth and ith components of L(a,b) and are therefore mapped into memory k.
Furthermore these two instances have an element in common, since the ith
component of L(a+x_j-x_1, b+y_j-y_1) and the jth component of
L(a+x_i-x_1, b+y_i-y_1) are both (a+x_j+x_i-x_1, b+y_j+y_i-y_1). Figure 10
provides an intuitive picture of the situation. ■

Several  remarks  are  appropriate  concerning  this  theorem.  When 
given  a  collection  of  several  generalized  lines,  this  theorem  can  still  be 
used  to  determine  the  validity  of  a  skewing  scheme  by  applying  it  to  each 
generalized  line  of  the  collection  individually,  since  the  skewing  scheme 
must  be  valid  for  each  generalized  line,  independently  of  its  validity  for 
the others in the collection. A second consideration is that this theorem
is non-constructive. Although it only provides a method for establishing
the validity of a given skewing scheme, it will be used in the next
section to establish a constructive result.

Figure 9: Pictorial Presentation of the Proof of Theorem 4.
[L(a_1,b_1) and L(a_2,b_2), whose designated elements are stored in
memory module k, have an element in common.
L(a_1+x_1-x_j, b_1+y_1-y_j) has two components mapped
by φ to the same memory. Hence φ is not valid.]

Figure 10: Pictorial Presentation of the Proof of
Theorem 4.
[L(a+x_j-x_1, b+y_j-y_1) and L(a+x_i-x_1, b+y_i-y_1) have
their designated elements stored in memory module
k and they have an element in common.]

2.3 Existence and Construction of Valid Skewing Schemes when the
Number of Memory Modules Equals the Length of the Generalized Line

The results in this section will be developed first for one
generalized line, and then extended to collections of generalized lines.

Definition 9: A generalized line tesselates the
plane if and only if there exists a (necessarily
infinite) collection of instances of the generalized
line, so that every ordered pair in the plane is in
one and only one of these instances.

If the realization of a generalized line as a rigid template
composed of unit squares is used, then a generalized line tesselates the
plane if and only if it can tile the "infinite floor" without gaps or
overlapping. The generalized line whose realization is shown in Figure 11
tesselates the plane, but the generalized line
given by T = ((0,0), (0,1), (0,2), (1,1), (2,1)), whose realization is
T-shaped, cannot. Figure 12 illustrates why the T-shaped line cannot
tesselate the plane. Notice that no matter where the first instance is
laid down on the plane, there is only one way another instance can be laid
down so that the square labeled A is covered and the instances do not
overlap. Now the square labeled B cannot be safely covered.


In  the  literature  [5,6]  tesselations  of  the  plane  normally  permit  rotations 
and  reflections  of  the  basic  shapes.  These  transformations  are  excluded  in 
this  discussion. 

Figure 11: Tesselation of the Plane by a Generalized Line

Figure 12: The Generalized Line L = ((0,0), (0,1), (0,2), (1,1), (2,1))
Cannot Tesselate the Plane. [Squares A and B are marked.]

Theorem 5: Given N memory modules and a generalized
line of length N, there is a valid skewing scheme for
this generalized line if and only if it tesselates
the plane.

Proof: Suppose that the generalized line tesselates the plane.
Define φ as follows: φ(i,j) = k, where (i,j) is the (k+1)st component of
the instance contained in the tesselation which covers the point (i,j).
φ is well defined since each point in the plane is covered exactly once.
The question is: Given any instance, does φ map the N components of the
instance into distinct memory modules? The reader should note that for
φ to be valid the answer to the above question must be yes for all
instances of the generalized line, including those not contained in the
tesselation. Theorem 4 will be applied to show φ is valid.

Consider verification of the condition in Theorem 4, when k = 0.
The set of instances that must be checked for overlap are just those that
comprise the given tesselation, by the way φ was defined. Since the
instances used in a tesselation do not overlap each other, the condition
is verified for k = 0. Now let k be some other value in {0,1,2,...,N-1}.
Note that every element of the plane stored in memory k is a fixed shift
from an element stored in memory zero. To be specific, if the generalized
line is L = ((x_1,y_1), (x_2,y_2), ..., (x_N,y_N)), then every element stored
in memory k is at (x_{k+1}-x_1, y_{k+1}-y_1) away from an element stored in
memory zero. Conversely, every element in the plane
(x_{k+1}-x_1, y_{k+1}-y_1) away from an element stored in memory zero is
stored in memory k. Thus when verifying the condition of Theorem 4 for
arbitrary k, the set of instances that must be checked for overlap form a
tesselation, the tesselation formed by shifting the given tesselation by
(x_{k+1}-x_1, y_{k+1}-y_1). Again, since the instances
comprising a tesselation do not overlap, the condition is seen to hold
for all k, and, hence, φ is valid.

In order to prove the converse suppose φ is a valid skewing
scheme for the generalized line, L, using N memory modules. Let
T = {L(i,j) | φ(i,j) = 0}, i.e. T is the set of instances whose designated
elements are mapped by φ into memory zero. The claim is that T is a
tesselation. The condition in Theorem 4, applied when k = 0, guarantees
that no two instances of T overlap. Thus, to prove that T is a tesselation
it is necessary to show that every point in the plane is a component of
some instance contained in T. Suppose not, i.e. there is a point (c,d)
such that (c,d) is not a component of any instance in T. Now (c,d) is a
component of precisely N instances of L, since there is exactly one
instance of the generalized line in which it is the hth component, for
h=1,2,...,N. Now consider the N distinct designated elements of these N
instances. φ cannot map any of these N designated elements into memory
zero, for it was assumed that (c,d) was not a component of any instance
in T. Thus there are two elements in this set, say (a_1,b_1) and (a_2,b_2),
so that φ(a_1,b_1) = φ(a_2,b_2) = l ≠ 0, by the pigeonhole principle. Now the
condition of Theorem 4 is violated when k = l, since L(a_1,b_1) and L(a_2,b_2)
will both have their designated elements mapped to memory l and they both
contain (c,d). This contradicts the validity of φ. This contradiction
arose by assuming T did not cover (c,d). Thus T covers every point in the
plane, and T is a tesselation. ■
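
The construction in the first half of the proof can be sketched in code.
The example below is an illustration under stated assumptions: the
tesselation is taken to be periodic, so it can be described by the
designated elements that fall in one period x period block, and the
particular line and designated elements (the points with i+2j ≡ 0 mod 5)
are choices made for this sketch. All function names are hypothetical.

```python
from itertools import product

def scheme_from_tesselation(line, designated, period):
    """phi(i,j) = k when (i,j) is the (k+1)st component of the covering
    instance. Raises ValueError if the instances do not tesselate."""
    table = {}
    for (a, b) in designated:
        for k, (x, y) in enumerate(line):
            cell = ((a + x) % period, (b + y) % period)
            if cell in table:
                raise ValueError("instances overlap: not a tesselation")
            table[cell] = k
    if len(table) != period * period:
        raise ValueError("gaps remain: not a tesselation")
    return lambda i, j: table[(i % period, j % period)]

def is_valid(phi, line, period):
    """Check every instance (one period of starting points suffices)."""
    return all(
        len({phi(a + x, b + y) for (x, y) in line}) == len(line)
        for a, b in product(range(period), repeat=2)
    )

PLUS = ((0, 0), (1, -1), (1, 0), (1, 1), (2, 0))   # five-element example
T_LINE = ((0, 0), (0, 1), (0, 2), (1, 1), (2, 1))  # the T-shaped line
DESIGNATED = [(i, j) for i, j in product(range(5), repeat=2)
              if (i + 2 * j) % 5 == 0]

phi = scheme_from_tesselation(PLUS, DESIGNATED, 5)
```

Here `is_valid(phi, PLUS, 5)` holds, including for instances that are not
part of the tesselation, while the same designated elements used with
`T_LINE` raise `ValueError` because the instances overlap.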

Given a tesselation of the plane by a generalized line, the proof
of the theorem shows how a valid skewing scheme can be constructed. This
theorem provides a practical means for determining the existence of a
valid skewing scheme, since the construction of a tesselation can normally
be done by observation. Figure 13 indicates the skewing scheme resulting
from the generalized line pictured there and the use of this
theorem. The dark heavy lines indicate that in this case the skewing
scheme obtained is periodic. More will be said about periodic skewing
schemes in other sections.

The result of Theorem 5 can be extended to collections of
generalized lines.

Theorem 6: Given N memory modules and a collection
of generalized lines, {L_1, L_2, ..., L_p}, all of length
N, then there is a valid skewing scheme for this
collection if and only if there exist tesselations
of the plane, T_1, T_2, ..., T_p, such that T_i is a
tesselation using L_i and O_1 = O_2 = ... = O_p, where
O_i = {designated elements of the instances of L_i
comprising T_i}.

Because of Theorem 5, the condition that each generalized line
tesselate the plane is clearly needed. An intuitive visualization of the
condition on the O_i makes use of Theorem 6 easier, and may aid in
understanding the proof. Imagine that each tesselation of the plane, T_i,
is performed using the rigid template determined by the generalized line,
L_i, and that each tesselation is done on a separate sheet of clear
plastic. In addition, let the designated elements of the instances be
marked by asterisks. Then O_i is the set of points on the copy of the
plane used for T_i containing an asterisk, and the condition that
O_1 = O_2 = ... = O_p becomes: When the sheets of clear plastic are overlaid,
the asterisks on the sheets of plastic coincide.

Figure 13: The Skewing Scheme Resulting From
the Use of Theorem 5.
[L = ((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2)). The
heavy lines indicate that the skewing scheme is periodic.]

Proof of Theorem 6: Suppose that φ is valid for this
collection. Then the method of proof used in Theorem 5, applied to
each generalized line separately, produces a tesselation T_v using L_v.
In addition the tesselation that results by following the construction
in that proof yields O_v = {(i,j) | φ(i,j) = 0}. Since O_v is independent
of v, O_1 = O_2 = ... = O_p.

To establish the converse suppose tesselations, T_v, using L_v,
exist, and O_1 = O_2 = ... = O_p. Then define φ by φ(i,j) = k, where (i,j)
is the (k+1)st component of the instance of L_1 contained in T_1. This is
exactly the same construction employed in the proof of Theorem 5, and so
φ is well defined and a valid skewing scheme for L_1. It must also be
shown that φ is a valid skewing scheme for L_2, L_3, ..., and L_p. Pick an
arbitrary generalized line, say L_v. Theorem 4 can be applied to show that
φ is a valid skewing scheme for L_v. In verifying that the condition of
Theorem 4 holds when k = 0, the set of instances that must be examined for
common elements are just those instances comprising the tesselation, T_v.
Now consider verifying that the condition of Theorem 4 holds for arbitrary k.
As in the proof of Theorem 5, the set of instances that must be examined for
common elements is obtained by shifting the tesselation. However, the
amount of the shift, (x_{k+1}-x_1, y_{k+1}-y_1), is determined by the
components of L_1, the generalized line used to construct φ, despite the
fact that the condition is being verified for the generalized line, L_v.
Since a rigid translation of a tesselation is a tesselation, and v and k
were arbitrary, the theorem is proved. ■
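
The converse direction can be illustrated with the two five-element lines
L_1 and L_3 that appear in the examples following this proof. The sketch
assumes periodic tesselations whose shared designated elements are the
points with i+2j ≡ 0 mod 5; that choice, and the function names, are
assumptions made for the illustration.

```python
from itertools import product

PERIOD = 5
L1 = ((0, 0), (1, 0), (1, 1), (2, 0), (2, 1))
L3 = ((0, 0), (1, -1), (1, 0), (1, 1), (2, 0))
# Common set of designated elements, restricted to one period.
ORIGINS = [(i, j) for i, j in product(range(PERIOD), repeat=2)
           if (i + 2 * j) % PERIOD == 0]

def covers_exactly_once(line, origins, period):
    """True if the instances anchored at `origins` tesselate the torus."""
    cells = [((a + x) % period, (b + y) % period)
             for (a, b) in origins for (x, y) in line]
    return len(cells) == period * period == len(set(cells))

def scheme_from(line, origins, period):
    """phi built from the tesselation by `line`, as in the proof."""
    table = {((a + x) % period, (b + y) % period): k
             for (a, b) in origins for k, (x, y) in enumerate(line)}
    return lambda i, j: table[(i % period, j % period)]

def is_valid(phi, line, period):
    return all(
        len({phi(a + x, b + y) for (x, y) in line}) == len(line)
        for a, b in product(range(period), repeat=2)
    )

phi = scheme_from(L1, ORIGINS, PERIOD)  # constructed from L_1 only
```

Although `phi` is built from the tesselation by L_1 alone, it is also
valid for L_3, because both tesselations share the same designated
elements.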

A few examples may be helpful in illustrating some of the
subtleties that can arise in using this theorem. Consider the generalized
lines L_1 = ((0,0), (1,0), (1,1), (2,0), (2,1)) and
L_2 = ((0,0), (1,0), (2,0), (2,1), (2,2)). Each of these generalized lines
tesselates the plane separately, as Figures 14 and 15 show. Since these
are the only tesselations possible, except for rigid shifting, it is clear
that tesselations do not exist for these two generalized lines which can be
positioned so their designated elements coincide. Thus there is no valid
skewing scheme for the collection {L_1,L_2} using only five memories.
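
A partial consistency check is possible by exhaustive search. The sketch
below only examines linear schemes φ(i,j) = ai+bj mod 5, so it does not
establish the statement above, which covers skewing schemes of every
type. The function name is hypothetical. Because a linear scheme is
translation invariant, testing the instance at the origin is enough.

```python
def linear_schemes_valid_for(lines, n):
    """All (a, b) for which phi(i,j) = a*i + b*j mod n maps the
    components of every given line to distinct modules."""
    return [
        (a, b)
        for a in range(n)
        for b in range(n)
        if all(len({(a * x + b * y) % n for (x, y) in line}) == len(line)
               for line in lines)
    ]

L1 = ((0, 0), (1, 0), (1, 1), (2, 0), (2, 1))
L2 = ((0, 0), (1, 0), (2, 0), (2, 1), (2, 2))

both = linear_schemes_valid_for([L1, L2], 5)
only_l1 = linear_schemes_valid_for([L1], 5)
only_l2 = linear_schemes_valid_for([L2], 5)
```

Each line alone admits linear schemes with five modules, but no single
pair (a,b) serves both lines.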

A possible question arises from contemplating this example:
If another generalized line, L_v', which produces the same geometric
realization as L_v, is substituted for L_v in the collection
{L_1, L_2, ..., L_p}, is it possible a valid skewing scheme will now exist,
where before there was no valid skewing scheme? The answer to this
question can be important in practice, since it is convenient to think in
geometric terms. Fortunately, the substitution described above does not
affect the existence of a valid skewing scheme, since tesselations of the
plane using L_v' appear to the eye as rigid shifts of tesselations of the
plane using L_v. Thus, when actually using this theorem to find valid
skewing schemes, it is permissible to just draw pictures, as has been done
throughout, and to pick the designated elements arbitrarily.

Figures 14 and 16 illustrate another situation. Both the
generalized line $L_1$, given earlier, and
$L_3 = ((0,0),(1,-1),(1,0),(1,1),(2,0))$, whose geometric realization is a
plus-shaped template of five unit squares, tesselate the plane, and it is
clear that when these tesselations are overlaid, their designated elements
coincide. Thus Theorem 6 guarantees a skewing scheme using five memory
modules that allows conflict free access to all instances of either
generalized line. However, there is an alternative tesselation for $L_3$,
given in Figure 17. If this tesselation had been used, along with the only
tesselation for $L_1$, then the incorrect conclusion, that there is no
valid skewing scheme for the collection $\{L_1, L_3\}$, might have been
drawn. The statement of Theorem 6 only requires the existence of a set of
tesselations which also satisfy an additional condition. Thus if more than
one tesselation exists for some of the generalized lines in the
collection, they must all be tried before concluding that no valid skewing
scheme exists.

Figure 14: Tesselation of the Plane by the Generalized
Line, L = ((0,0), (1,0), (1,1), (2,0), (2,1)).

Figure 16: Tesselation of the Plane by the Generalized
Line, L = ((0,0), (1,-1), (1,0), (1,1), (2,0)).

2.4 Existence and Construction of Valid Periodic Skewing Schemes

As  was  pointed  out  in  Chapter  1,  periodic  skewing  schemes  are 
a  valuable  subset  of  all  skewing  schemes,  since,  for  restricted  values  of 
N,  a  reasonable  amount  of  additional  hardware  permits  address  computation 
by  table  look-up.   For  this  reason  it  would  be  nice  if  the  theorems  in 
the  last  two  sections  could  be  restricted  to  determine  the  existence  of 
valid  periodic  skewing  schemes. 

The  essential  idea  of  the  needed  alteration  is  depicted  in 
Figures  13  and  18.   Consider  the  memory  storage  map,  infinite  in  both 
directions,  defined  by  a  periodic  skewing  scheme.   If  the  plane  is 
partitioned  into  NxN  squares,  where  N  is  the  number  of  memory  modules, 
then  the  memory  map  defined  by  the  skewing  scheme  is  identical  within 
each  partition.   The  bold  lines  in  Figure  13  illustrate  this.   The  main 
point  to  be  observed  is  that  when  an  instance  of  a  generalized  line  extends 
over  one  of  the  partitioning  lines,  instead  of  considering  the  instance  to 
be as in Figure 18(a), it can be considered to be as in Figure 18(b).

Figure 17: Another Possible Tesselation of the Plane by
the Generalized Line, L = ((0,0), (1,-1), (1,0),
(1,1), (2,0)).

Figure 18: The "Wrap Around" Interpretation of a
Generalized Line Used with Periodic
Skewing Schemes.

This view is appropriate in determining the validity of a periodic
skewing scheme, since in Figure 18(b) the data elements of the instance
that are on the left are stored in the same memory modules as the data
elements of the instance that extend beyond the partition line in
Figure 18(a). This situation has been referred to as "wrapping around."
This can occur over horizontal partitioning lines as well. Since there
are no special properties used when an instance of a generalized line
wraps around, the opposite edges of the N×N square can be identified,
resulting in a torus. The entire problem of finding valid periodic
skewing schemes can be recast into the framework of looking for valid
skewing schemes on the torus formed by identifying opposite edges of the
N×N square.

Definition 9 and Theorems 4, 5, and 6 carry over to the torus
with only minor modification.

Theorem 7: Given a generalized line of length M,
N memory modules, and a periodic skewing scheme, the
skewing scheme is valid for this generalized line if
and only if the following condition holds for every
$k \in \{0, 1, 2, \ldots, N-1\}$: When all instances of the
generalized line on the torus (formed by identifying
opposite edges of the N×N square) are considered, in
which the designated element of the instance is mapped
into memory k, no two of the instances have an element
in common.
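The condition of Theorem 7 can be checked mechanically on a small torus. The following is a minimal sketch, not the author's procedure: it assumes a periodic skewing scheme is given as an N×N table of module numbers, that the designated element of a generalized line is its first component (0,0), that the components stay distinct after wrapping, and that "valid" means every instance touches pairwise distinct modules. The function names are hypothetical.

```python
from itertools import product

def instance(line, base, n):
    """Cells of the instance of `line` based at `base`, wrapped onto the n x n torus."""
    c, d = base
    return {((c + p) % n, (d + q) % n) for p, q in line}

def valid_by_definition(table, line):
    """Direct check: every wrapped instance uses pairwise distinct modules."""
    n = len(table)
    for base in product(range(n), repeat=2):
        cells = instance(line, base, n)
        modules = {table[i][j] for i, j in cells}
        if len(cells) != len(line) or len(modules) != len(line):
            return False
    return True

def valid_by_theorem7(table, line):
    """Theorem 7 style check: for each module k, the instances whose designated
    element (assumed to be the base point) lies in module k are pairwise disjoint."""
    n = len(table)
    for k in range(n):
        seen = set()
        for base in product(range(n), repeat=2):
            if table[base[0]][base[1]] != k:
                continue
            cells = instance(line, base, n)
            if seen & cells:
                return False
            seen |= cells
    return True

# Hypothetical example: rows of a 4 x 4 array under the scheme (i + j) mod 4.
N = 4
SKEWED = [[(i + j) % N for j in range(N)] for i in range(N)]
UNSKEWED = [[i % N for j in range(N)] for i in range(N)]
ROW = [(0, 0), (0, 1), (0, 2), (0, 3)]
```

On these inputs both checks accept `SKEWED` and reject `UNSKEWED` for `ROW`, which is the agreement the theorem predicts.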

Definition 10: A generalized line tesselates the torus
(formed by identifying the opposite edges of an N×N
square) if and only if there exists a collection of
instances of the generalized line, so that every
ordered pair on the torus is in one and only one
of these instances.


Theorem  8:   Given  N  memory  modules  and  a  generalized 
line  of  length  N,  there  is  a  valid  periodic  skewing 
scheme  for  this  generalized  line  if  and  only  if  it 
tesselates  the  torus  (formed  by  identifying  opposite 
edges  of  the  NxN  square). 
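For small N the tesselation condition of Theorem 8 can be tested by exhaustive search. The sketch below is only an illustration under stated assumptions: it uses plain backtracking (not a method taken from the text), represents a generalized line as a list of offsets, and discards instances whose components coincide after wrapping. The name `tessellate_torus` is hypothetical.

```python
def tessellate_torus(line, n):
    """Return base points of instances that partition the n x n torus, or None."""
    cells_of = {}
    for c in range(n):
        for d in range(n):
            cells = frozenset(((c + p) % n, (d + q) % n) for p, q in line)
            if len(cells) == len(line):   # wrapped components must stay distinct
                cells_of[(c, d)] = cells

    def search(covered, bases):
        if len(covered) == n * n:
            return bases
        # The first uncovered cell (row-major) must be covered by some instance.
        target = next((i, j) for i in range(n) for j in range(n)
                      if (i, j) not in covered)
        for base, cells in cells_of.items():
            if target in cells and not (cells & covered):
                found = search(covered | cells, bases + [base])
                if found is not None:
                    return found
        return None

    return search(frozenset(), [])
```

For the line ((0,0), (0,1), (2,1), (2,2)) with N = 4, the search returns four base points whose instances cover all sixteen cells, consistent with the existence of a valid periodic skewing scheme for that line.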

Theorem 9: Given N memory modules and a collection
of generalized lines, $\{L_1, L_2, \ldots, L_p\}$, all of length
N, then there is a valid periodic skewing scheme for
this collection if and only if there exist tesselations
of the torus (formed by identifying opposite edges of
the N×N square), $T_1, T_2, \ldots, T_p$, such that $T_i$ is a
tesselation using $L_i$ and $D_1 = D_2 = \cdots = D_p$, where
$D_i$ = {designated elements of the instances of $L_i$
comprising $T_i$}.


The proofs of these theorems are identical to those of
Theorems 4, 5, and 6, except that the arithmetic must be done in the
residue classes mod N. When drawing pictures to determine the existence of
valid periodic skewing schemes only an N×N square is needed as long as
instances extending beyond the bounds of the square are wrapped around.

A  few  additional  comments  are  in  order  before  closing  this 
section.   In  working  on  the  torus,  as  in  the  plane,  the  realization  of  a 
generalized  line,  as  a  rigid  template  composed  of  unit  squares,  need  not 




An interesting question which can be asked is: Can the
situation be restricted still further to determine the existence of
valid linear skewing schemes? The answer to this question is not
fully known. Notice that the skewing scheme given in Figure 13 is
periodic, but not linear. However, a valid linear skewing scheme does
exist for the generalized line
L = ((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2)), whose geometric
realization is an H-shaped template (up to orientation). One such is
illustrated in Figure 19. Only the N×N square is shown, since linear
skewing schemes are periodic, and thus only the N×N square with wrap
around need be considered. Other generalized lines, like
L = ((0,0), (0,1), (2,1), (2,2)), whose geometric realization is a
disconnected shape, provide examples of generalized lines for which valid
periodic skewing schemes exist, but for which no valid linear skewing
schemes exist. Figure 20 implies the existence of a valid periodic skewing
scheme. Trying all possibilities for a and b in
$\varphi(i,j) = ai + bj \bmod N$, where, since N = 4, a and b need only run
over 0, 1, 2, and 3, rules out the existence of valid linear skewing
schemes. The general question of when a valid periodic skewing scheme for
a collection of generalized lines implies a valid linear skewing scheme
appears quite difficult. Chapter 3 investigates this question for
$[x,y]_N$-lines. The general problem is discussed again in Chapter 4.
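The exhaustive trial of a and b described above is easy to reproduce. The sketch below assumes a linear scheme has the form (a·i + b·j) mod N and that a generalized line is a list of offsets; because a linear scheme shifts every component of an instance by the same constant when the instance is translated, it is enough to test the instance based at the origin. The function name is hypothetical.

```python
def valid_linear_schemes(line, n):
    """All (a, b) for which (i, j) -> (a*i + b*j) mod n sends the components
    of the line to pairwise distinct memory modules."""
    found = []
    for a in range(n):
        for b in range(n):
            modules = {(a * p + b * q) % n for p, q in line}
            if len(modules) == len(line):
                found.append((a, b))
    return found

# The two generalized lines discussed in the text.
DISCONNECTED = [(0, 0), (0, 1), (2, 1), (2, 2)]                          # N = 4
SEVEN_CELL = [(0, 0), (0, 1), (0, 2), (1, 1), (2, 0), (2, 1), (2, 2)]    # N = 7
```

With these inputs, `valid_linear_schemes(DISCONNECTED, 4)` is empty, while the list for `SEVEN_CELL` with N = 7 contains, for example, (a, b) = (1, 3).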
Figure 19: A Valid Linear Skewing Scheme for the
Generalized Line, L = ((0,0), (0,1),
(0,2), (1,1), (2,0), (2,1), (2,2)).

Figure 20: Proof of the Existence of a Periodic
Skewing Scheme for L = ((0,0), (0,1),
(2,1), (2,2)).

3. SPECIAL RESULTS ON $[x,y]_N$-LINES


3.1 Introduction

In Chapter 2, geometric conditions were developed which aid
in determining if valid skewing schemes exist for collections of
generalized lines. In this chapter, only collections of $[x,y]_N$-lines
are considered. The highly structured nature of $[x,y]_N$-lines permits
additional results to be obtained. The main result, which will be proved
over the course of the next several sections, is

Theorem 10: Given N memory modules and a collection
of $[x,y]_N$-lines, $\{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$, then
there is a valid periodic skewing scheme for the
collection if and only if there is a valid linear
skewing scheme for the collection.

The if direction is trivial. The only if direction, however, is
a rather surprising result, since the number of valid periodic skewing
schemes for a collection of generalized lines usually greatly exceeds the
number of valid linear skewing schemes. The proof technique is a
generalization of an argument used by Polya for a restricted subcase [13].

3.2  Preliminaries 

As was discussed in Chapter 2, when dealing with periodic
skewing schemes it is convenient to replace the plane with the torus
formed by identifying opposite edges of the N×N square. An instance of
an $[x,y]_N$-line can now be viewed as having its first component at (i,j)
and with successive components located by going over x and down y on
the torus, until a total of N points are generated. Figure 21 illustrates
this construction. It might happen that in generating these N points
two of them will coincide. If this should occur, then there can be no
valid periodic skewing scheme using N memory modules for this
$[x,y]_N$-line, since there are two distinct matrix elements, in the same
instance of this $[x,y]_N$-line, which must be assigned to the same memory
module by any periodic skewing scheme. The condition that must be imposed
is simple.

Lemma 1: If $(x_i, y_i, N) \ne 1$, then there is no valid
periodic skewing scheme using N memory modules for
the generalized line, L, the $[x_i,y_i]_N$-line.

Proof: Suppose $(x_i, y_i, N) = s > 1$. Then
$L(c,d) = ((c,d), (c+y_i, d+x_i), \ldots, (c+vy_i, d+vx_i), \ldots, (c+(N-1)y_i, d+(N-1)x_i))$.
Note that $x_i/s$, $y_i/s$, and $N/s$ are integers and that $N/s \le N-1$.
If $\varphi$ is an arbitrary periodic skewing scheme then
$\varphi(c,d) = \varphi(c + N\tfrac{y_i}{s}, d + N\tfrac{x_i}{s}) = \varphi(c + \tfrac{N}{s}y_i, d + \tfrac{N}{s}x_i)$.
Since $(c,d)$ and $(c + \tfrac{N}{s}y_i, d + \tfrac{N}{s}x_i)$ are components of
the same instance of L, $\varphi$ is not valid. ■
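The coincidence of points used in this proof can be observed directly. The following sketch generates the components of L(c,d) as written in the proof and checks whether two of them coincide on the torus; the function names are hypothetical, and the comparison with the greatest common divisor is an illustration of the lemma rather than part of the original argument.

```python
from math import gcd

def xy_line_instance(x, y, n, c=0, d=0):
    """Components (c + v*y, d + v*x) for v = 0..n-1, reduced mod n,
    following the form of L(c, d) in the proof of Lemma 1."""
    return [((c + v * y) % n, (d + v * x) % n) for v in range(n)]

def has_repeated_point(x, y, n):
    """True if two of the n generated components coincide on the torus."""
    return len(set(xy_line_instance(x, y, n))) < n

def gcd3(x, y, n):
    """Greatest common divisor of x, y and n, written (x, y, N) in the text."""
    return gcd(gcd(x, y), n)
```

For every small N tried, a repeated point occurs exactly when `gcd3(x, y, n)` exceeds 1, for instance with x = 2, y = 4, N = 6.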

Thus, given a collection of $[x,y]_N$-lines,
$\{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$, if even one of the
$[x_i,y_i]_N$-lines is such that $(x_i, y_i, N) \ne 1$, then there are no
valid periodic skewing schemes, using N memory modules, for this
collection. $(x_i, y_i, N) = 1$ for $i = 1, 2, \ldots, I$ is, therefore, a
necessary condition for the existence of a valid periodic skewing scheme.
It is not a sufficient condition, however.

When $(x_i, y_i, N) = 1$ some simplifications are possible. In
general, given an arbitrary generalized line, L, of length N, there are
$N^2$ distinct instances of L on the torus. However, letting L be the
$[x_i,y_i]_N$-line, on the torus the components of $L(c,d)$,
$L(c+y_i, d+x_i)$, ..., and $L(c+(N-1)y_i, d+(N-1)x_i)$ are the same. The
ordering of the components within these instances is different, but for
purposes of determining the validity of a periodic skewing scheme these N
instances can be regarded as one instance. Thus, for $[x,y]_N$-lines, with
$(x, y, N) = 1$, the number of instances on the torus is effectively N,
instead of $N^2$. Throughout the rest of this chapter an $[x,y]_N$-line,
with $(x, y, N) = 1$, will be regarded as having only N instances on the
torus. Notice that no two of the N instances have any elements in common
and since the torus has $N^2$ elements, every element on the torus is in
one and only one instance. It is possible to characterize these N
instances.

Note: $(x_i, y_i, N)$ is the greatest common divisor of $x_i$, $y_i$ and N.

Figure 21: $[x,y]_N$-lines on the Torus. An example of an instance of an
$[x,y]_N$-line on the torus, with i = 1, j = 2, x = 3, y = 2. The order in
which the elements were generated is indicated on the figure.

Lemma 2: Given an $[x,y]_N$-line, with $(x, y, N) = 1$,
each of the N instances of the $[x,y]_N$-line can be
characterized by an integer in $\{0, 1, 2, \ldots, N-1\}$, in
the following manner: If (w,z) is a component of
an instance of the $[x,y]_N$-line, characterize this
instance by $xw - yz \bmod N$.

Proof: The proposed characterization is a function from the N
instances of the $[x,y]_N$-line to $\{0, 1, 2, \ldots, N-1\}$. First it must
be shown that this is indeed a well-defined function. Let (w,z) and
(w',z') be different components of the same instance of the
$[x,y]_N$-line. Then $w' \equiv w + vy$ and $z' \equiv z + vx$, and
$xw' - yz' \equiv xw + xvy - yz - yvx \equiv xw - yz$, so, in fact, the
mapping defined is a function. (Congruences are mod N, unless otherwise
indicated.)

Additionally, the function is a one-to-one correspondence. The
pigeonhole principle implies that to see this it suffices to show that
for any i there exist w and z such that $xw - yz \equiv i$. It is a
well-known result in number theory that given x and y there exist c and d
such that $xc - yd = (x,y)$ [7]. Since $((x,y), N) = (x, y, N) = 1$ and the
residue classes of numbers relatively prime to N form a group under
multiplication, there exists g such that $(x,y) \cdot g \equiv 1$. Hence
$xcg - ydg \equiv 1$, and thus $xcgi - ydgi \equiv i$. Letting
$w = cgi \bmod N$ and $z = dgi \bmod N$ gives the needed w and z. ■
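The characterization of Lemma 2 can be tabulated for a concrete case. This is a minimal sketch assuming the instance based at (c,d) consists of the points (c + v·y, d + v·x) mod N; the function name `characterize` is hypothetical. It groups the torus into instances and records the label x·w − y·z mod N of each.

```python
def characterize(x, y, n):
    """Map each instance (a frozenset of torus points) of the [x, y]-line
    to its label x*w - y*z mod n, checking the label is constant on it."""
    labels = {}
    for c in range(n):
        for d in range(n):
            inst = frozenset(((c + v * y) % n, (d + v * x) % n) for v in range(n))
            values = {(x * w - y * z) % n for w, z in inst}
            assert len(values) == 1        # well defined on the instance
            labels[inst] = values.pop()
    return labels
```

For x = 3, y = 2, N = 6 the table has six instances and their labels are exactly 0 through 5, as the one-to-one correspondence requires.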

3.3  The  Special  Case  of  a  Prime  Number  of  Memory  Modules 

Theorem 10 is easy to prove if N, the number of memory modules,
is a prime.

Lemma 3: Given N memory modules, N a prime number,
and a collection of $[x,y]_N$-lines,
$\{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$, then there is a valid
periodic skewing scheme for the collection if and only if there is a
valid linear skewing scheme for the collection.
Proof: As remarked earlier, the if direction is trivial. If $\varphi$
is a valid periodic skewing scheme then exactly N points on the torus,
formed by identifying opposite edges of the N×N square, will be mapped
by $\varphi$ into memory module zero. This is so because exactly one
component of each of the N disjoint instances of the $[x_1,y_1]_N$-line
must be mapped to zero by $\varphi$. Without loss of generality, the
element (0,0) can be assumed to be one of these elements. Consider any
other element mapped into memory zero, say the element $(\hat{y}, \hat{x})$.
Construct the a and b for the linear skewing scheme as follows:

if $\hat{y} \equiv 0$ then a = 1, b = 0

else if $\hat{x} \equiv 0$ then a = 0, b = 1

else b = 1, $a = -\hat{y}^{-1}\hat{x}$,

where $\hat{y}^{-1}$ is the multiplicative inverse of $\hat{y}$ in the field
of residue classes mod N. (Note that $\hat{y} \not\equiv 0$ and N is prime.)

The claim is that the linear skewing scheme
$(c,d) \mapsto ac + bd \bmod N$ is valid. Suppose not, i.e.
$ac + bd \equiv a(c + vy_i) + b(d + vx_i)$ for some c, d,
$i \in \{1, 2, \ldots, I\}$, and $v \in \{1, 2, \ldots, N-1\}$. This implies
$avy_i + bvx_i \equiv 0$, but since N is a prime and $v \not\equiv 0$, this
last equation implies that $ay_i + bx_i \equiv 0$. Three cases will be
examined to show that $ay_i + bx_i \equiv 0$ contradicts the validity of the
periodic skewing scheme $\varphi$.

Case 1: $\hat{y} \equiv 0$. $ay_i + bx_i \equiv 0$ reduces to
$y_i \equiv 0$, since a = 1 and b = 0. Since, by Lemma 1,
$(x_i, y_i, N) = 1$, $x_i \not\equiv 0$. It follows that $x_i^{-1}$ exists in
the field of residue classes mod N, since N is prime and $x_i$ is
non-zero. Thus
$(\hat{y},\hat{x}) = (0,\hat{x}) = (0, \hat{x}x_i^{-1}x_i) = (0 + \hat{x}x_i^{-1}y_i, 0 + \hat{x}x_i^{-1}x_i)$.
Now (0,0) and $(0 + \hat{x}x_i^{-1}y_i, 0 + \hat{x}x_i^{-1}x_i)$ are distinct
components of the same instance of the $[x_i,y_i]_N$-line, because
$\hat{x}x_i^{-1}$ is non-zero. They are both, however, mapped by $\varphi$
to memory zero, contradicting the validity of $\varphi$.

Case 2: $\hat{x} \equiv 0$. $ay_i + bx_i \equiv 0$ reduces to
$x_i \equiv 0$. Here $y_i \not\equiv 0$, and, in a manner similar to case 1,
$(\hat{y},\hat{x}) = (\hat{y},0) = (\hat{y}y_i^{-1}y_i, 0) = (0 + \hat{y}y_i^{-1}y_i, 0 + \hat{y}y_i^{-1}x_i)$.
This is a similar situation to that encountered in case 1.

Case 3: $\hat{x} \not\equiv 0$, $\hat{y} \not\equiv 0$.
$ay_i + bx_i \equiv 0$ becomes $-\hat{y}^{-1}\hat{x}y_i + x_i \equiv 0$.
Now $y_i \not\equiv 0$, for $y_i \equiv 0$ implies $x_i \equiv 0$ and then
$(x_i, y_i, N) \ne 1$, contrary to the requirement established by Lemma 1.
But $y_i \not\equiv 0$ implies $y_i^{-1}$ exists in the field of residue
classes mod N. Thus $\hat{x} \equiv \hat{y}y_i^{-1}x_i$ and
$(\hat{y},\hat{x}) = (\hat{y}y_i^{-1}y_i, \hat{y}y_i^{-1}x_i) = (0 + \hat{y}y_i^{-1}y_i, 0 + \hat{y}y_i^{-1}x_i)$,
giving rise to the same contradiction as in case 1.

Thus, the assumption that $ac + bd \bmod N$ is not a valid
linear skewing scheme led to the conclusion that $\varphi$ is not a valid
periodic skewing scheme, contrary to the hypothesis of the lemma. ■
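The construction of a and b in this proof is short enough to run. The sketch below is an illustration under assumptions, not the author's code: the periodic scheme is given as an N×N table with N prime, the module stored at (0,0) plays the role of "memory zero", the first other point found in that module serves as the second element, and validity of the result is tested with the criterion attributed to Theorem 3 in the text, namely gcd(a·y + b·x, N) = 1 for every [x,y]-line. It needs Python 3.8 or later for `pow(y, -1, n)`.

```python
from math import gcd

def linear_from_periodic(table):
    """Follow the proof's construction of (a, b) from a valid periodic scheme
    on an n x n torus, n prime."""
    n = len(table)
    zero = table[0][0]        # relabel: the module holding (0, 0) is "zero"
    # A valid periodic scheme places n points in each module, so a second
    # point always exists when n >= 2.
    y_hat, x_hat = next((c, d) for c in range(n) for d in range(n)
                        if (c, d) != (0, 0) and table[c][d] == zero)
    if y_hat == 0:
        return 1, 0
    if x_hat == 0:
        return 0, 1
    return (-pow(y_hat, -1, n) * x_hat) % n, 1

def is_valid_linear(a, b, lines, n):
    """Criterion used in the text: gcd(a*y + b*x, n) = 1 for every [x, y]-line."""
    return all(gcd((a * y + b * x) % n, n) == 1 for x, y in lines)

# Hypothetical input: rows, columns and both diagonals on a 5 x 5 torus.
LINES = [(1, 0), (0, 1), (1, 1), (1, 4)]
TABLE = [[(c + 2 * d) % 5 for d in range(5)] for c in range(5)]
```

Here `linear_from_periodic(TABLE)` returns (3, 1), and that pair passes `is_valid_linear` for all four lines.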
It is now possible to precisely characterize when valid
periodic skewing schemes exist for a collection of $[x,y]_N$-lines, if
the number of memory modules is a prime.

Lemma 4: Given N memory modules, N a prime number,
and a collection of $[x,y]_N$-lines,
$\{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$, then there exists a
valid periodic skewing scheme for this collection if and only if
$(x_i, y_i, N) = 1$, for $i = 1, 2, \ldots, I$, and either $I \le N$ or
for all non-zero choices for $g_i$, $i = 1, 2, \ldots, N+1$, and any
permutation, $\sigma$, of $\{1, 2, \ldots, I\}$, it is not the case
that $(x_{\sigma(1)}, y_{\sigma(1)}) \equiv (0, g_1)$,
$(x_{\sigma(2)}, y_{\sigma(2)}) \equiv (g_2, g_2)$, ...,
$(x_{\sigma(i)}, y_{\sigma(i)}) \equiv (g_i(i-1), g_i)$, ...,
$(x_{\sigma(N)}, y_{\sigma(N)}) \equiv (g_N(N-1), g_N)$, and
$(x_{\sigma(N+1)}, y_{\sigma(N+1)}) \equiv (g_{N+1}, 0)$.

Proof: Lemma 3 shows that it is sufficient to prove that the
conditions stated above are precisely those needed to guarantee the
existence of valid linear skewing schemes for the collection of
$[x,y]_N$-lines. The need for the condition $(x_i, y_i, N) = 1$ for
$i = 1, 2, \ldots, I$ has already been discussed.

The only if direction: Suppose $I \ge N+1$ and there exist
non-zero $g_i$ and a permutation, $\sigma$, so that
$(x_{\sigma(1)}, y_{\sigma(1)}) \equiv (0, g_1)$, ...,
$(x_{\sigma(N)}, y_{\sigma(N)}) \equiv (g_N(N-1), g_N)$, and
$(x_{\sigma(N+1)}, y_{\sigma(N+1)}) \equiv (g_{N+1}, 0)$. It will
be shown that for any a and b, $\varphi(c,d) = ac + bd \bmod N$ is not a
valid skewing scheme.

Case 1: $b \equiv 0$.
$ay_{\sigma(N+1)} + bx_{\sigma(N+1)} \equiv a \cdot 0 + 0 \cdot g_{N+1} \equiv 0$.
Thus $(ay_{\sigma(N+1)} + bx_{\sigma(N+1)}, N) \ne 1$, and, by Theorem 3,
$\varphi$ is not a valid skewing scheme.

Case 2: $b \not\equiv 0$. Because N is a prime number,
$(ay_i + bx_i, N) = 1$ for all i if and only if $ay_i + bx_i \not\equiv 0$
for all i. Also $ay_i + bx_i \not\equiv 0$ for all i if and only if
$ab^{-1}y_i + x_i \not\equiv 0$ for all i. Now $ab^{-1} \equiv j$ for some
$j \in \{0, 1, 2, \ldots, N-1\}$. Let $k \equiv -j$. Then
$ab^{-1}y_{\sigma(k+1)} + x_{\sigma(k+1)} \equiv ab^{-1}g_{k+1} + g_{k+1}k \equiv jg_{k+1} - jg_{k+1} \equiv 0$,
where $(x_{\sigma(k+1)}, y_{\sigma(k+1)}) \equiv (g_{k+1}k, g_{k+1})$ was
used. Thus $ay_i + bx_i \equiv 0$ for some i and Theorem 3 implies
$\varphi$ is not valid.

The if direction: $(x_i, y_i, N) = 1$ for $i = 1, 2, \ldots, I$ and
suppose, for the moment, $I \ge N+1$, but for all non-zero choices for
$g_i$ and any permutation, $\sigma$, of $\{1, 2, \ldots, I\}$, it is not the
case that $(x_{\sigma(1)}, y_{\sigma(1)}) \equiv (0, g_1)$, ...,
$(x_{\sigma(N)}, y_{\sigma(N)}) \equiv (g_N(N-1), g_N)$, and
$(x_{\sigma(N+1)}, y_{\sigma(N+1)}) \equiv (g_{N+1}, 0)$.
A valid linear skewing scheme will be defined for the collection of
$[x,y]_N$-lines.

Case 1: Suppose that there exists $j \in \{0, 1, 2, \ldots, N-1\}$, such
that $(x_i, y_i) \not\equiv (g_{j+1}j, g_{j+1})$ for all i and all non-zero
choices of $g_{j+1}$, i.e. the reason a permutation cannot be constructed
with $(x_{\sigma(1)}, y_{\sigma(1)}) \equiv (0, g_1)$, ...,
$(x_{\sigma(N)}, y_{\sigma(N)}) \equiv (g_N(N-1), g_N)$, and
$(x_{\sigma(N+1)}, y_{\sigma(N+1)}) \equiv (g_{N+1}, 0)$ is
that there is no possible choice for $\sigma(j+1)$. (Note that when
attempting to construct such permutations, $(x_i, y_i)$ can be congruent to
only one of $(0, g_1)$, $(g_2, g_2)$, ..., $(g_N(N-1), g_N)$, and
$(g_{N+1}, 0)$.) Take $a = -j$ and $b = 1$. Now if $(ay_k + bx_k, N) \ne 1$
for some k, then $ay_k + bx_k \equiv 0$, since N is prime, and hence
$-jy_k + x_k \equiv 0$. Then $(x_k, y_k) \equiv (jy_k, y_k)$, contrary to the
assumption that $(x_i, y_i) \not\equiv (g_{j+1}j, g_{j+1})$ for any i and any
non-zero $g_{j+1}$, since $g_{j+1}$ can be taken to be $y_k$. (Here $y_k$ is
non-zero, since $y_k \equiv 0$ would force $x_k \equiv 0$ and hence
$(x_k, y_k, N) \ne 1$.) Hence, $(ay_i + bx_i, N) = 1$ for all i and
$\varphi(c,d) = ac + bd \bmod N$ is valid.

Case 2: For any i and any non-zero $g_{N+1}$,
$(x_i, y_i) \not\equiv (g_{N+1}, 0)$. This is similar to case 1, and occurs
when there is no possible choice for $\sigma(N+1)$ for which
$(x_{\sigma(1)}, y_{\sigma(1)}) \equiv (0, g_1)$, ...,
$(x_{\sigma(N)}, y_{\sigma(N)}) \equiv (g_N(N-1), g_N)$, and
$(x_{\sigma(N+1)}, y_{\sigma(N+1)}) \equiv (g_{N+1}, 0)$. Take a = 1 and
b = 0. Again, suppose $(ay_k + bx_k, N) \ne 1$ for some k; then
$ay_k + bx_k \equiv 0$ reduces to $y_k \equiv 0$, and
$(x_k, y_k) \equiv (x_k, 0) \equiv (g_{N+1}, 0)$ where $g_{N+1}$ is taken as
$x_k$. This contradicts the assumption that
$(x_i, y_i) \not\equiv (g_{N+1}, 0)$ for all i and all non-zero choices of
$g_{N+1}$. Thus $\varphi(c,d) = ac + bd \bmod N$ is valid.

Since these two cases exhaust all the reasons why a permutation,
$\sigma$, cannot be constructed so that for some non-zero choices of $g_i$,
$(x_{\sigma(1)}, y_{\sigma(1)}) \equiv (0, g_1)$, ...,
$(x_{\sigma(N)}, y_{\sigma(N)}) \equiv (g_N(N-1), g_N)$, and
$(x_{\sigma(N+1)}, y_{\sigma(N+1)}) \equiv (g_{N+1}, 0)$, there is a valid
linear skewing scheme for the collection of $[x,y]_N$-lines.

If $I \le N$ the pigeonhole principle gives the same two cases, since
there will either exist a $j \in \{0, 1, \ldots, N-1\}$ such that for any
$i \in \{1, 2, \ldots, I\}$ and any non-zero choice of $g_{j+1}$,
$(x_i, y_i) \not\equiv (g_{j+1}j, g_{j+1})$, or for any
$i \in \{1, 2, \ldots, I\}$ and any non-zero choice of $g_{N+1}$,
$(x_i, y_i) \not\equiv (g_{N+1}, 0)$. ■
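One way to read the condition of Lemma 4 is that the pairs $(x_i, y_i)$ must not fall into all N+1 of the classes $(0,g), (g,g), \ldots, (g(N-1), g), (g, 0)$. The sketch below takes that reading as an assumption, labels the classes by a hypothetical helper `direction`, and compares the resulting test with a brute-force search over a and b using the criterion gcd(a·y + b·x, N) = 1 attributed to Theorem 3. It needs Python 3.8 or later for modular inverses via `pow`.

```python
from math import gcd

def direction(x, y, n):
    """Class of (x, y) mod prime n: j if (x, y) = (g*j, g) for some non-zero g,
    or n if (x, y) = (g, 0)."""
    x, y = x % n, y % n
    return n if y == 0 else (x * pow(y, -1, n)) % n

def lemma4_condition(lines, n):
    """True when gcd(x, y, n) = 1 for every line and some class is left unused."""
    if any(gcd(gcd(x, y), n) != 1 for x, y in lines):
        return False
    return len({direction(x, y, n) for x, y in lines}) < n + 1

def linear_scheme_exists(lines, n):
    """Brute force over a and b with the gcd criterion used in the text."""
    return any(all(gcd((a * y + b * x) % n, n) == 1 for x, y in lines)
               for a in range(n) for b in range(n))

# Rows, columns, and the two diagonals, written as (x, y) pairs.
ROWS_COLS_DIAGS = [(1, 0), (0, 1), (1, 1), (1, -1)]
```

With three memory modules the four line types occupy all four classes, so no linear scheme exists, whereas with five modules two classes remain free and a scheme exists.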

3.4 Generalization to Composite N

To complete the proof of Theorem 10 it suffices to prove two
statements:

1. Given N, the number of memory modules, N a
composite number, a collection of $[x,y]_N$-lines,
$C = \{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$, and a valid
periodic skewing scheme, $\varphi$, for this collection,
using N memory modules, then if p is a prime
factor of N, there is a valid periodic skewing
scheme, $\varphi'$, using only p memory modules, for
the collection of $[x,y]_p$-lines,
$C' = \{[x_i,y_i]_p\text{-lines} \mid i = 1, 2, \ldots, I\}$.

2. Given a collection of $[x,y]_M$-lines,
$S = \{[x_i,y_i]_M\text{-lines} \mid i = 1, 2, \ldots, I\}$, and a valid linear
skewing scheme, $\varphi$, for S, using M memory modules, and given
a valid linear skewing scheme, $\varphi'$, for
$S' = \{[x_i,y_i]_{M'}\text{-lines} \mid i = 1, 2, \ldots, I\}$, using M'
memory modules, then there is a valid linear
skewing scheme, $\varphi''$, using MM' memory modules, for
the collection of $[x,y]_{MM'}$-lines,
$S'' = \{[x_i,y_i]_{MM'}\text{-lines} \mid i = 1, 2, \ldots, I\}$.

Note that the $x_i$ and $y_i$ are the same in both C and C', but the
generalized lines are different, since they have different length.

To see that this is sufficient note that given a valid periodic
skewing scheme, using N memory modules, for the collection of
$[x,y]_N$-lines, $\{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$,
statement 1 above implies valid periodic skewing schemes for each prime
factor of N. Lemma 3 then implies the existence of valid linear skewing
schemes for this collection for each prime factor of N, and, finally,
statement 2, applied once for each prime factor of N counted with
multiplicity, implies existence of a valid linear skewing scheme for this
collection using N memories. Because of Theorem 3, statement 2 is
equivalent to

Lemma 5: Given a collection of ordered pairs,
$\{(x_i, y_i) \mid i = 1, 2, \ldots, I\}$, and given M, M', a, b, a',
and b', such that $(ay_i + bx_i, M) = (a'y_i + b'x_i, M') = 1$,
for $i = 1, 2, \ldots, I$, then there exist a'' and b'' such
that $(a''y_i + b''x_i, MM') = 1$, for $i = 1, 2, \ldots, I$.

Proof: The proof is constructive. Let $\pi$ be the product of
those prime numbers which divide M, but do not divide M'. Let $\pi'$ be
the product of those prime numbers which divide M', but do not divide
M. Finally, let $\rho$ be the product of those primes which divide both M
and M'. For the sake of definiteness, in calculating $\pi$, $\pi'$ and
$\rho$ include a prime factor in the product only once, even if it appears
in the prime factorization of M and/or M' several times. Also if there are
no prime factors from which to calculate $\pi$, $\pi'$ or $\rho$, set
$\pi$, $\pi'$ or $\rho$, as the case dictates, to one.

Define $a'' = a\pi'\rho + a'\pi$ and $b'' = b\pi'\rho + b'\pi$. The claim
is that $(a''y_i + b''x_i, MM') = 1$ for $i = 1, 2, \ldots, I$. Suppose not,
i.e. for some i, there is a prime, p, for which $p \mid a''y_i + b''x_i$ and
$p \mid MM'$. If $p \mid a''y_i + b''x_i$, then
$p \mid a\pi'\rho y_i + a'\pi y_i + b\pi'\rho x_i + b'\pi x_i$. By the
definitions of $\pi$, $\pi'$ and $\rho$, p divides exactly one of them.
Assume $p \mid \pi$. Since $p \mid \pi$ implies
$p \mid a'\pi y_i + b'\pi x_i$, it can be concluded that
$p \mid a\pi'\rho y_i + b\pi'\rho x_i$. Since p is a prime and
$p \nmid \pi'$ and $p \nmid \rho$, it must happen that
$p \mid ay_i + bx_i$. However, $p \mid \pi$ implies $p \mid M$, so
$(ay_i + bx_i, M) \ne 1$, contrary to the hypothesis of the lemma. Thus
if $p \mid a''y_i + b''x_i$ and $p \mid MM'$, then $p \nmid \pi$. By
similar arguments $p \nmid \pi'$ and $p \nmid \rho$. But this is
impossible. Hence, $a''y_i + b''x_i$ and MM' have no prime factors
in common, i.e. $(a''y_i + b''x_i, MM') = 1$ for $i = 1, 2, \ldots, I$. ■
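Because the proof of Lemma 5 is constructive, it translates directly into a few lines of code. The sketch below is a minimal illustration with hypothetical function names; it computes the three products of distinct primes by trial division and returns the combined coefficients $a''$ and $b''$. It needs Python 3.8 or later for `math.prod`.

```python
from math import gcd, prod

def prime_factors(m):
    """Set of distinct primes dividing m (trial division)."""
    primes, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            primes.add(p)
            m //= p
        p += 1
    if m > 1:
        primes.add(m)
    return primes

def combine(a, b, m, a2, b2, m2):
    """Return (a'', b'') for m*m2 modules from (a, b) for m and (a2, b2) for m2,
    following the definitions in the proof of Lemma 5."""
    pm, pm2 = prime_factors(m), prime_factors(m2)
    pi = prod(pm - pm2)       # primes dividing m only (empty product is 1)
    pi2 = prod(pm2 - pm)      # primes dividing m2 only
    rho = prod(pm & pm2)      # primes dividing both
    return a * pi2 * rho + a2 * pi, b * pi2 * rho + b2 * pi

def coprime_for_all(a, b, pairs, m):
    """Hypothesis and conclusion of Lemma 5: gcd(a*y + b*x, m) = 1 for every pair."""
    return all(gcd(a * y + b * x, m) == 1 for x, y in pairs)
```

As a usage example, with the pairs (1,0), (0,1), (1,2), the coefficients (1,1) for M = 4 and (1,2) for M' = 9 combine to (5,7), which satisfies the condition for MM' = 36.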

To complete the proof of Theorem 10, it is, therefore, sufficient
to prove statement 1, above. Statement 1 will be proved by contradiction.
To this end, suppose a collection of $[x,y]_p$-lines,
$C' = \{[x_i,y_i]_p\text{-lines} \mid i = 1, 2, \ldots, I\}$, is given, p is a
prime number, there is no valid periodic skewing scheme for C', using p
memory modules, and there is a valid periodic skewing scheme, using N
memory modules, for the collection of $[x,y]_N$-lines,
$C = \{[x_i,y_i]_N\text{-lines} \mid i = 1, 2, \ldots, I\}$, where
$p \mid N$.

Some additional technical lemmas are useful.
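Lemma 6:

$$
\begin{bmatrix}
p-1 & 0 & 0 & \cdots & 0 & 1 \\
0 & 1^{-1} & 1^{-2} & \cdots & 1^{-(p-2)} & 1^{-(p-1)} \\
0 & 2^{-1} & 2^{-2} & \cdots & 2^{-(p-2)} & 2^{-(p-1)} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & (p-1)^{-1} & (p-1)^{-2} & \cdots & (p-1)^{-(p-2)} & (p-1)^{-(p-1)}
\end{bmatrix}
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
0 & 1 & 2 & \cdots & p-1 \\
0 & 1^{2} & 2^{2} & \cdots & (p-1)^{2} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 1^{p-2} & 2^{p-2} & \cdots & (p-1)^{p-2} \\
0 & 1^{p-1} & 2^{p-1} & \cdots & (p-1)^{p-1}
\end{bmatrix}
= (p-1)I,
$$

where all calculations are done in the field of residue
classes mod p, p a prime number.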

Proof: For convenience the second matrix will be denoted by $M_p$
and the first by $M'_p$. Consider the first row of the product matrix.
Clearly, the first row of $M'_p$ times the first column of $M_p$ is p-1.
The first row of $M'_p$ times any other column of $M_p$ is zero, since it
is $(p-1) \cdot 1 + 0 \cdot i + \cdots + 0 \cdot i^{p-2} + 1 \cdot i^{p-1} = (p-1) + i^{p-1}$.
Fermat's theorem [7] states that, if $i \ne 0$, $i^{p-1} = 1$. Thus
$p - 1 + i^{p-1} = p - 1 + 1 = p = 0$. (Note that all equations are in the
field of residue classes mod p; in particular, = means $\equiv_p$.) Thus
the first row of the product matrix is $[p-1 \;\; 0 \;\; 0 \;\; \cdots \;\; 0]$.
Now consider any other row of $M'_p$,
$[0 \;\; i^{-1} \;\; i^{-2} \;\; \cdots \;\; i^{-(p-1)}]$, times any column
of $M_p$, $[1 \;\; j \;\; j^2 \;\; \cdots \;\; j^{p-1}]^T$. The product is
$0 + i^{-1}j + i^{-2}j^2 + \cdots + i^{-(p-1)}j^{p-1} = (i^{-1}j) + (i^{-1}j)^2 + \cdots + (i^{-1}j)^{p-1}$.

In general $(x + x^2 + \cdots + x^{p-1})(x - 1) = x^p - x$, but since p is
prime, Fermat's theorem implies $x^p = x$ for any x. Thus
$(x + x^2 + \cdots + x^{p-1})(x - 1) = 0$. But in a field, the product of
two numbers is zero if and only if at least one of them is zero. Thus if
$x \ne 1$, $x + x^2 + \cdots + x^{p-1} = 0$, and, clearly, if $x = 1$,
$x + x^2 + \cdots + x^{p-1} = p - 1$. Replacing x by $i^{-1}j$, and noting
$i^{-1}j = 1$ if and only if $i = j$, gives that the off-diagonal elements
of the product matrix are zero and the diagonal elements are p-1. Thus
$M'_p \times M_p = (p-1)I$. ■
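The identity of Lemma 6 is easy to confirm numerically for small primes. The sketch below builds the two matrices under the reading that $M_p$ has entries $j^r$ (row r, column j, with $0^0 = 1$) and that $M'_p$ has first row $[p-1, 0, \ldots, 0, 1]$ followed by rows $[0, i^{-1}, \ldots, i^{-(p-1)}]$; the function names are hypothetical, and Python 3.8 or later is needed for modular inverses via `pow`.

```python
def lemma6_matrices(p):
    """Build M'_p and M_p with entries reduced mod the prime p."""
    def inv(i):
        return pow(i, -1, p)
    m_prime = [[p - 1] + [0] * (p - 2) + [1]]
    m_prime += [[0] + [pow(inv(i), k, p) for k in range(1, p)]
                for i in range(1, p)]
    # Row r, column j holds j**r; pow(0, 0, p) evaluates to 1.
    m = [[pow(j, r, p) for j in range(p)] for r in range(p)]
    return m_prime, m

def matmul_mod(a, b, p):
    """Matrix product with every entry reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def scaled_identity(p):
    """The matrix (p - 1) * I of order p."""
    return [[(p - 1) if i == j else 0 for j in range(p)] for i in range(p)]
```

For p = 2, 3, 5 and 7 the product in either order equals `scaled_identity(p)`.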

Corollary 1: $\det(M_p) \ne 0$, where the calculations
are performed in the field of residue classes mod p,
p a prime number.

Proof: $(p-1)M'_p$ is a left inverse for $M_p$, since
$(p-1)^2 = p^2 - 2p + 1 = 1$. But if a matrix has a left inverse, the left
inverse is a two-sided inverse, and the matrix has a non-zero
determinant [8]. ■

Corollary 2:
$$\sum_{i=0}^{p-1} i^r \equiv_p \begin{cases} p-1 & \text{if } r = p-1 \\ 0 & \text{if } r = 0,1,2,\ldots,p-2 \end{cases}$$
where $0^0 = 1$, by convention.

Proof: While, in general, matrix multiplication does not commute, $M'_p \times M_p = M_p \times M'_p$, since $M'_p$ is (almost) the inverse of $M_p$. Note that the last column of $M'_p$ is all ones, since $i^{-(p-1)} = (i^{-1})^{p-1} = 1$, by Fermat's theorem. Thus $\left(\sum_{i=0}^{p-1} i^r\right) \bmod p$ is just the $(r+1)$st row of $M_p$ times the last column of $M'_p$. Since $M_p \times M'_p = (p-1)I$, the last column of the product matrix is $[\,0\ \ 0\ \ \ldots\ \ 0\ \ p-1\,]^T$, and the corollary is proved. ■

Lemma 7:
$$\sum_{i=0}^{p^e-1} i^r \equiv_{p^e} \begin{cases} -p^{e-1} & \text{if } r = p-1 \\ 0 & \text{if } r = 0,1,\ldots,p-2 \end{cases} \qquad \text{for } e \geq 1. \qquad (1)$$

Proof: The proof is by induction on e. For the basis case, $e = 1$, formula (1) reduces to
$$\sum_{i=0}^{p-1} i^r \equiv_p \begin{cases} -p^0 = -1 \equiv_p p-1 & \text{if } r = p-1 \\ 0 & \text{if } r = 0,1,2,\ldots,p-2 \end{cases}$$
This is just Corollary 2.

Therefore, assume that the result is true for $e' = e-1$. Since formula (1) is clearly true for $r = 0$, r will be assumed greater than zero throughout the remainder of the proof. Now,
$$\sum_{\ell=0}^{p^e-1} \ell^r = \sum_{j=0}^{p^{e-1}-1}\ \sum_{i=0}^{p-1} (jp+i)^r = \sum_{i=0}^{p-1}\ \sum_{j=0}^{p^{e-1}-1} (jp+i)^r. \qquad (2)$$

Expanding $(jp+i)^r$ by the binomial expansion and rearranging the terms in the sum gives
$$\sum_{\ell=0}^{p^e-1} \ell^r = \sum_{k=0}^{r}\ \sum_{i=0}^{p-1}\ \sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k. \qquad (3)$$

Isolate an inner sum,
$$\sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k = \binom{r}{k} p^{r-k} i^k \sum_{j=0}^{p^{e-1}-1} j^{r-k}. \qquad (4)$$

By the induction assumption,
$$\sum_{j=0}^{p^{e-1}-1} j^{r-k} \equiv_{p^{e-1}} \begin{cases} -p^{e-2} & \text{if } r-k = p-1 \\ 0 & \text{if } r-k = 0,1,\ldots,p-2 \end{cases}$$

Thus,
$$\sum_{j=0}^{p^{e-1}-1} j^{r-k} = \begin{cases} -p^{e-2} + \alpha_{r,k}\,p^{e-1} & \text{if } r-k = p-1 \\ \alpha_{r,k}\,p^{e-1} & \text{if } r-k = 0,1,\ldots,p-2 \end{cases} \qquad (5)$$
for some $\alpha_{r,k}$, integer. By substituting (5) into (4) it is possible to determine the value of the sums in (3).

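Case 1: Middle terms of the binomial expansion, $r > k > 0$. In this case $1 \leq r-k < p-1$. Thus
$$\binom{r}{k} p^{r-k} i^k \sum_{j=0}^{p^{e-1}-1} j^{r-k} = \binom{r}{k} p^{r-k} i^k\, \alpha_{r,k}\, p^{e-1} \equiv_{p^e} 0. \qquad (6)$$

Thus for $k = 1,2,\ldots,r-1$,
$$\sum_{i=0}^{p-1}\ \sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k \equiv_{p^e} 0.$$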


Case 2: The first term of the binomial expansion, $r > k = 0$. Here, again substituting (5) into (4),
$$\binom{r}{k} p^{r-k} i^k \sum_{j=0}^{p^{e-1}-1} j^{r-k} = p^r \sum_{j=0}^{p^{e-1}-1} j^r = \begin{cases} p^r\left(-p^{e-2} + \alpha_{r,0}\,p^{e-1}\right) & \text{if } r = p-1 \\ p^r\,\alpha_{r,0}\,p^{e-1} & \text{if } r = 0,1,\ldots,p-2 \end{cases} \qquad (7)$$

Now in (7), $p^r \alpha_{r,0}\, p^{e-1} \equiv_{p^e} 0$, since $r \geq 1$. Also in (7), $p^r\left(-p^{e-2} + \alpha_{r,0}\,p^{e-1}\right) \equiv_{p^e} 0$ if $r = p-1 \geq 2$, that is $p \geq 3$. Thus (7) becomes
$$\binom{r}{k} p^{r-k} i^k \sum_{j=0}^{p^{e-1}-1} j^{r-k} \equiv_{p^e} \begin{cases} -p^{e-1} & \text{if } p = 2 \\ 0 & \text{otherwise} \end{cases} \qquad (8)$$

Thus for $k = 0$, $\sum_{i=0}^{p-1} \sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k \equiv_{p^e} 0$, unless $p = 2$. But, even if $p = 2$ the sum is zero mod $p^e$, since $\sum_{i=0}^{p-1} \sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k$ reduces, by use of (8), to $\sum_{i=0}^{p-1} \left(-p^{e-1}\right)$, which is $-p^{e-1}\cdot p \equiv_{p^e} 0$.


Case 3: The last term of the binomial expansion, $r = k > 0$. Here, $\sum_{i=0}^{p-1} \sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k$ reduces to
$$\sum_{i=0}^{p-1}\ \sum_{j=0}^{p^{e-1}-1} i^r = p^{e-1} \sum_{i=0}^{p-1} i^r = \begin{cases} p^{e-1}\left(p-1 + \alpha_{r,r}\,p\right) & \text{if } r = p-1 \\ p^{e-1}\,\alpha_{r,r}\,p & \text{if } r = 0,1,\ldots,p-2 \end{cases} \qquad (9)$$
where Corollary 2 is used in obtaining the last equality. This reduces further to
$$\sum_{i=0}^{p-1}\ \sum_{j=0}^{p^{e-1}-1} i^r \equiv_{p^e} \begin{cases} -p^{e-1} & \text{if } r = p-1 \\ 0 & \text{if } r = 0,1,2,\ldots,p-2 \end{cases} \qquad (10)$$

Summing $\sum_{i=0}^{p-1} \sum_{j=0}^{p^{e-1}-1} \binom{r}{k} j^{r-k} p^{r-k} i^k$ over all k, and using the results of cases 1, 2, and 3, completes the proof of the lemma. ■
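Formula (1) of Lemma 7 lends itself to a direct numerical check. The following is a minimal sketch, assuming the convention $0^0 = 1$ stated in Corollary 2; the function names `power_sum` and `lemma7_value` are hypothetical and chosen only for this illustration.

```python
def power_sum(p, e, r):
    """Sum of i**r for i = 0 .. p**e - 1, reduced mod p**e (0**0 counts as 1)."""
    m = p ** e
    return sum(pow(i, r, m) for i in range(m)) % m


def lemma7_value(p, e, r):
    """Right-hand side of formula (1), as a residue in the range 0 .. p**e - 1."""
    m = p ** e
    if r == p - 1:
        return (-(p ** (e - 1))) % m
    return 0
```

For instance, with $p = 3$, $e = 2$, $r = 2$ the sum $0^2 + 1^2 + \cdots + 8^2 = 204$ leaves remainder 6 on division by 9, which is the residue of $-3$. With $e = 1$ the check reduces to Corollary 2.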


Lemma 8: If $p^e \mid N$, $p^{e+1} \nmid N$ and $e \geq 1$ then $p^e \nmid \sum_{i=0}^{N-1} i^{p-1}$.

Proof: $p^e \mid \sum_{i=0}^{N-1} i^{p-1}$ if and only if $\sum_{i=0}^{N-1} i^{p-1} \equiv_{p^e} 0$. Now
$$\sum_{i=0}^{N-1} i^{p-1} \equiv_{p^e} \frac{N}{p^e} \sum_{i=0}^{p^e-1} i^{p-1}.$$
$p^{e+1} \nmid N$ implies $p \nmid \frac{N}{p^e}$ and, by Lemma 7, $\sum_{i=0}^{p^e-1} i^{p-1} \equiv_{p^e} -p^{e-1}$. Thus $\frac{N}{p^e} \sum_{i=0}^{p^e-1} i^{p-1} \not\equiv_{p^e} 0$. ■
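Lemma 8 can likewise be spot-checked by brute force. The sketch below is an illustration under the lemma's hypotheses, not part of the original argument; `prime_factors`, `valuation` and `lemma8_holds` are hypothetical helper names.

```python
def prime_factors(n):
    """Return the distinct prime factors of n (n >= 2) by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def valuation(n, p):
    """Largest e such that p**e divides n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def lemma8_holds(N, p):
    """Check that p**e does not divide sum(i**(p-1) for i < N), where p**e exactly divides N."""
    e = valuation(N, p)
    if e < 1:
        raise ValueError("p must be a prime factor of N")
    total = sum(i ** (p - 1) for i in range(N))
    return total % (p ** e) != 0
```

As an example, for $N = 12$ and $p = 2$ the exponent is $e = 2$ and the sum $0 + 1 + \cdots + 11 = 66$ is not divisible by 4.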

It  is  now  possible  to  prove  Theorem  10. 

Proof of Theorem 10: As was pointed out earlier, all that remains to be proved is the first of the two statements found at the beginning of this section. To this end, suppose a collection of $[x,y]_p$-lines, p a prime factor of N, is given, $C_p = \{[x_i,y_i]_p\text{-lines} \mid i = 1,2,\ldots,I\}$, so that there is no valid periodic skewing scheme for this collection, using p memory modules. The assumption of the existence of a valid periodic skewing scheme, using N memory modules, for the collection of $[x,y]_N$-lines, $C_N = \{[x_i,y_i]_N\text{-lines} \mid i = 1,2,\ldots,I\}$, will lead to a contradiction.

Suppose a valid periodic skewing scheme exists for $C_N$ using N memory modules. Call it $\varphi$. Then exactly N points on the torus must be mapped by the skewing scheme into memory module zero, one element from each of the N disjoint instances of the $[x_1,y_1]_N$-line. Let this set of N points be $\{(u_j,v_j) \mid j = 0,1,\ldots,N-1\}$. Because $\varphi$ is assumed to be a valid periodic skewing scheme for $C_N$, using N memory modules, for any $[x_i,y_i]_N$-line in the collection each of the $(u_j,v_j)$ is a component of a different instance of the $[x_i,y_i]_N$-line. By Lemma 2, $\{(x_i u_j - y_i v_j) \bmod N \mid j = 0,1,\ldots,N-1\} = \{0,1,2,\ldots,N-1\}$, for $i = 1,2,\ldots,I$.

Thus, by Lemma 8, $p^e \nmid \sum_{j=0}^{N-1} \left((x_i u_j - y_i v_j) \bmod N\right)^{p-1}$, for $i = 1,2,\ldots,I$, where e is chosen so $p^e \mid N$, $p^{e+1} \nmid N$ and $e \geq 1$, this last since $p \mid N$ by assumption. Note that $\sum_{j=0}^{N-1} \left((x_i u_j - y_i v_j) \bmod N\right)^{p-1}$ does not depend on the choice of i. This sum will be denoted by $\Sigma$.

Since there is assumed to be no valid periodic skewing scheme for $C_p$ using p memory modules, by Lemma 4 either $(x_i,y_i,p) \neq 1$ for some i, or $I \geq p+1$ and there exist non-zero $g_i$ and a permutation, $\sigma$, for which $(x_{\sigma(1)},y_{\sigma(1)}) \equiv_p (0,g_1)$, ..., $(x_{\sigma(p)},y_{\sigma(p)}) \equiv_p (g_p(p-1),g_p)$, and $(x_{\sigma(p+1)},y_{\sigma(p+1)}) \equiv_p (g_{p+1},0)$. Now if $(x_i,y_i,p) \neq 1$ for some i, then, since $p \mid N$, $(x_i,y_i,N) \neq 1$ also. This contradicts the assumed validity of $\varphi$. Therefore, I must be greater than or equal to $p+1$ and there must exist non-zero $g_i$ and a permutation, $\sigma$, with the required properties. Without loss of generality, assume $\sigma$ is the identity permutation.

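Consider the system
$$\begin{bmatrix} y_1^{p-1} & y_2^{p-1} & \cdots & y_p^{p-1} \\ y_1^{p-2}x_1 & y_2^{p-2}x_2 & \cdots & y_p^{p-2}x_p \\ \vdots & \vdots & & \vdots \\ x_1^{p-1} & x_2^{p-1} & \cdots & x_p^{p-1} \end{bmatrix} \begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_p \end{bmatrix} = \Delta \begin{bmatrix} y_{p+1}^{p-1} \\ y_{p+1}^{p-2}x_{p+1} \\ \vdots \\ x_{p+1}^{p-1} \end{bmatrix} \qquad (11)$$
where the matrix is called $M$, $\Delta = \det(M)$, $R = [\,y_{p+1}^{p-1}\ \ y_{p+1}^{p-2}x_{p+1}\ \ \ldots\ \ x_{p+1}^{p-1}\,]^T$, and the entries of $\Gamma = [\,\gamma_1\ \ \gamma_2\ \ \ldots\ \ \gamma_p\,]^T$ are unknowns to be determined.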
'1  2     'pJ 

An important question is: Does this system have a solution, and, if so, is it unique? The answer to both parts of this question will be yes if $\Delta \neq 0$. In order to prove $\Delta \neq 0$ and to obtain some information about the form of the $\gamma_i$, the system in (11) can be converted into a similar system in the field of residue classes mod p.

When the elements of $M$ are replaced by their values in the field of residue classes mod p, the resulting matrix is the $M_p$ of Lemma 6. To see this observe that $M_{i,j} = y_j^{p-i} x_j^{i-1} \equiv_p g_j^{p-i} (g_j(j-1))^{i-1} = g_j^{p-1}(j-1)^{i-1} \equiv_p (j-1)^{i-1} = (M_p)_{i,j}$, where $(x_j,y_j) \equiv_p (g_j(j-1),g_j)$ was used (recall $\sigma$ was assumed to be the identity permutation), and $g_j^{p-1} \equiv_p 1$ by Fermat's theorem. This observation justifies the choice of the name $M_p$ in Lemma 6.

Also notice that $\Delta \bmod p = \det(M_p)$ where the determinant of $M_p$ is calculated in the field of residue classes mod p. This is nothing more than observing that the mod operator and the det operator commute. Because $\Delta \bmod p = \det(M_p)$, $\Delta_p$ is a reasonable notation to use for either of these quantities. In a manner similar to that used in the proof that $M$ is converted to $M_p$ by the mod operator, $R$ is converted to $R_p = [\,0\ \ 0\ \ \ldots\ \ 0\ \ 1\,]^T$.

By Corollary 1 of Lemma 6, $\Delta_p \not\equiv_p 0$, so $\Delta \neq 0$. Thus the system $M\Gamma = \Delta R$ and the system $M_p X = \Delta_p R_p$ have unique solutions. The reader is cautioned that despite the fact that both systems have unique solutions, it is not obvious that the mod operator applied to the $\gamma_i$ converts $\Gamma$ to $X$. The reason for this is that the $\gamma_i$ might not be integers, i.e., if the $\gamma_i$ are only rational it makes no sense to consider $\gamma_i \bmod p$. If, however, all the $\gamma_i$ are integers then $X$ will in fact be $\Gamma_p$, the column vector obtained by replacing $\gamma_i$ with $\gamma_i \bmod p$.

It is possible to show, however, that $\gamma_i$ is an integer by solving $M\Gamma = \Delta R$ by means of Cramer's rule. When using Cramer's rule, $\gamma_i$ is calculated by replacing column i of $M$ by $\Delta R$, getting a new matrix, $M_i$, and then $\gamma_i = \dfrac{\det(M_i)}{\det(M)}$. By using common rules for manipulating determinants [8], $\gamma_i = \dfrac{\Delta \det(M_i')}{\Delta} = \det(M_i')$, where $M_i'$ is the matrix formed by replacing column i of $M$ by $R$. Since every element of $M_i'$ is an integer, $\det(M_i')$ is also an integer. Thus $\Gamma$ consists solely of integers, and $X = \Gamma_p$.

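In order to complete the argument, by arriving at the contradiction mentioned at the beginning of the proof, it is convenient to determine the form of $\Gamma$. This can be done by first determining the form of $\Gamma_p$. This is easily done directly. In the proof of Corollary 1 of Lemma 6, it was shown that $M_p^{-1} = (p-1)M'_p$. As noted earlier $R_p = [\,0\ \ 0\ \ \ldots\ \ 0\ \ 1\,]^T$, so
$$\Gamma_p = (p-1)M'_p\,\Delta_p R_p = (p-1)\Delta_p \begin{bmatrix} 1 \\ 1^{-(p-1)} \\ 2^{-(p-1)} \\ \vdots \\ (p-1)^{-(p-1)} \end{bmatrix}, \quad \text{since} \quad \begin{bmatrix} 1 \\ 1^{-(p-1)} \\ 2^{-(p-1)} \\ \vdots \\ (p-1)^{-(p-1)} \end{bmatrix}$$
is the last column of $M'_p$. But by Fermat's theorem
$$\begin{bmatrix} 1 \\ 1^{-(p-1)} \\ 2^{-(p-1)} \\ \vdots \\ (p-1)^{-(p-1)} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \quad \text{so} \quad \Gamma_p = \begin{bmatrix} \Delta_p(p-1) \\ \Delta_p(p-1) \\ \vdots \\ \Delta_p(p-1) \end{bmatrix}.$$
Renaming $\Delta_p(p-1) = c$, it follows that
$$\Gamma = \begin{bmatrix} c + \delta_1 p \\ c + \delta_2 p \\ \vdots \\ c + \delta_p p \end{bmatrix},$$
where the $\delta_i$ may all be distinct.

Finally consider the sum
$$(c+p\delta_1)\sum_{j=0}^{N-1}(x_1 u_j - y_1 v_j)^{p-1} + \cdots + (c+p\delta_p)\sum_{j=0}^{N-1}(x_p u_j - y_p v_j)^{p-1} \equiv_N$$
$$(c+p\delta_1)\Sigma + (c+p\delta_2)\Sigma + \cdots + (c+p\delta_p)\Sigma \equiv_N pc\Sigma + p\Sigma(\delta_1+\delta_2+\cdots+\delta_p) \equiv_N p\Sigma V,$$
where $V = \delta_1 + \delta_2 + \cdots + \delta_p + c$.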

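Now expand the sum another way.
$$\begin{aligned}
&(c+p\delta_1)\sum_{j=0}^{N-1}(x_1 u_j - y_1 v_j)^{p-1} + \cdots + (c+p\delta_p)\sum_{j=0}^{N-1}(x_p u_j - y_p v_j)^{p-1} = \\
&\quad (c+p\delta_1)x_1^{p-1}\sum_{j=0}^{N-1} u_j^{p-1} + (c+p\delta_1)(p-1)(-1)x_1^{p-2}y_1\sum_{j=0}^{N-1} u_j^{p-2}v_j + \cdots + (c+p\delta_1)(-1)^{p-1}y_1^{p-1}\sum_{j=0}^{N-1} v_j^{p-1} \\
&\ + (c+p\delta_2)x_2^{p-1}\sum_{j=0}^{N-1} u_j^{p-1} + (c+p\delta_2)(p-1)(-1)x_2^{p-2}y_2\sum_{j=0}^{N-1} u_j^{p-2}v_j + \cdots + (c+p\delta_2)(-1)^{p-1}y_2^{p-1}\sum_{j=0}^{N-1} v_j^{p-1} \\
&\ \ \vdots \\
&\ + (c+p\delta_p)x_p^{p-1}\sum_{j=0}^{N-1} u_j^{p-1} + (c+p\delta_p)(p-1)(-1)x_p^{p-2}y_p\sum_{j=0}^{N-1} u_j^{p-2}v_j + \cdots + (c+p\delta_p)(-1)^{p-1}y_p^{p-1}\sum_{j=0}^{N-1} v_j^{p-1}
\end{aligned} \qquad (12)$$

If the terms in the sum of equation (12) are added together by columns, the result is
$$\left(\sum_{j=0}^{N-1} u_j^{p-1}\right)\left(\sum_{i=1}^{p}(c+p\delta_i)x_i^{p-1}\right) + (p-1)(-1)\left(\sum_{j=0}^{N-1} u_j^{p-2}v_j\right)\left(\sum_{i=1}^{p}(c+p\delta_i)x_i^{p-2}y_i\right) + \cdots + (-1)^{p-1}\left(\sum_{j=0}^{N-1} v_j^{p-1}\right)\left(\sum_{i=1}^{p}(c+p\delta_i)y_i^{p-1}\right). \qquad (13)$$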


But by considering the system $M\Gamma = \Delta R$, and the value for $\Gamma$ derived above, $\sum_{i=1}^{p}(c+p\delta_i)x_i^{p-k}y_i^{k-1} = \Delta x_{p+1}^{p-k}y_{p+1}^{k-1}$, and when this is substituted into (13) the sum becomes
$$\left(\sum_{j=0}^{N-1} u_j^{p-1}\right)\Delta x_{p+1}^{p-1} + (p-1)(-1)\left(\sum_{j=0}^{N-1} u_j^{p-2}v_j\right)\Delta x_{p+1}^{p-2}y_{p+1} + \cdots + (-1)^{p-1}\left(\sum_{j=0}^{N-1} v_j^{p-1}\right)\Delta y_{p+1}^{p-1} = \Delta\sum_{j=0}^{N-1}(x_{p+1}u_j - y_{p+1}v_j)^{p-1} \equiv_N \Delta\Sigma.$$

Combining this with the form of the sum derived earlier, $p\Sigma V \equiv_N \Delta\Sigma$ or $\Sigma(pV-\Delta) \equiv_N 0$. Now e was chosen earlier in the proof so that $p^e \mid N$, but $p^e \nmid \Sigma$. However, if $\Sigma(pV-\Delta) \equiv_N 0$ then $p^e \mid \Sigma(pV-\Delta)$. Thus p must divide $pV-\Delta$. Since $p \mid pV$, this implies $p \mid \Delta$. But $p \nmid \Delta$, since $\Delta_p \not\equiv_p 0$.

This is the desired contradiction, and the theorem is proven. ■

3.5 Further Results and Examples

In Sections 3.1 through 3.4 it was shown that given N memory modules and a collection of $[x,y]_N$-lines, restricting consideration to linear skewing schemes suffices to determine the existence of valid periodic skewing schemes for the collection. It is natural to ask whether this result can be extended farther, so that by consideration of only linear skewing schemes, the question of the existence of an arbitrary skewing scheme, valid in the plane, can be settled. A limited answer is given by

Theorem 11: Given N, the number of memory modules, and a collection of $[x,y]_N$-lines, $\{[x_i,y_i]_N\text{-lines} \mid i = 1,2,\ldots,I\}$, if there exist two sequences of integers, $(a_1,a_2,\ldots,a_I)$ and $(b_1,b_2,\ldots,b_I)$, such that $\sum_{i=1}^{I}(a_i x_i, a_i y_i) = (0,1)$ and $\sum_{i=1}^{I}(b_i x_i, b_i y_i) = (1,0)$, then if there does not exist a valid linear skewing scheme for this collection, there is no valid skewing scheme for this collection.

Proof: The proof is quite simple. The conditions of this theorem and the result of Theorem 10 imply there is no valid periodic skewing scheme for this collection of $[x,y]_N$-lines. The conditions on the sequences of $a_i$'s and $b_i$'s will be seen to imply that if there are any valid skewing schemes, then they are periodic. This will establish the theorem.

Notice that for any valid skewing scheme, $\varphi$, $\varphi(i,j) = \varphi(i+Ny_k, j+Nx_k)$, for $k = 1,2,\ldots,I$ and any i and j. This is so, because $\varphi(i,j)$, $\varphi(i+y_k,j+x_k)$, ..., $\varphi(i+(N-1)y_k, j+(N-1)x_k)$ must all be distinct, since $((i,j), (i+y_k,j+x_k), \ldots, (i+(N-1)y_k, j+(N-1)x_k))$ is an instance of the $[x_k,y_k]_N$-line. Similarly, $\varphi(i+y_k,j+x_k)$, $\varphi(i+2y_k,j+2x_k)$, ..., $\varphi(i+Ny_k,j+Nx_k)$ must all be distinct, since $((i+y_k,j+x_k), (i+2y_k,j+2x_k), \ldots, (i+Ny_k,j+Nx_k))$ is also an instance of the $[x_k,y_k]_N$-line. Since these two instances have $N-1$ ordered pairs in common, the pigeonhole principle requires that $\varphi(i,j) = \varphi(i+Ny_k, j+Nx_k)$. From this it is clear that $\varphi(i,j) = \varphi(i+\ell Ny_k, j+\ell Nx_k)$, for $\ell = \ldots,-1,0,1,\ldots$ and any i and j.

Since k was arbitrary, it follows that for any $(c,d)$,
$$\varphi(c,d) = \varphi\left(c + \sum_{i=1}^{I} a_i N y_i,\ d + \sum_{i=1}^{I} a_i N x_i\right) = \varphi\left(c + N\sum_{i=1}^{I} a_i y_i,\ d + N\sum_{i=1}^{I} a_i x_i\right) = \varphi(c+N,d).$$
Similarly, by using the sequence of $b_i$'s, $\varphi(c,d) = \varphi(c,d+N)$. Since $(c,d)$ was arbitrary, $\varphi(c,d) = \varphi(c+N,d) = \varphi(c,d+N)$ establishes that $\varphi$ must be a periodic skewing scheme. ■

This condition, restrictive though it may be, is sufficient to resolve the most important practical case: $\{[1,0]_N\text{-line}, [0,1]_N\text{-line}, [1,1]_N\text{-line}, [1,-1]_N\text{-line}\}$. The sequences (0,1,0,0) and (1,0,0,0) suffice, clearly. Thus for this important case, considered by Budnik and Kuck [3] and Lawrie [10], if there does not exist a linear skewing scheme using N memory modules, and there does not when N is a power of two, then there is no valid skewing scheme of any type whatsoever.
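The claim that no linear skewing scheme serves rows, columns and both diagonals when N is a power of two can be illustrated by exhaustive search. The sketch below rests on two assumptions that are not restated in this section: a linear skewing scheme is taken to have the form $\varphi(i,j) = (ai + bj) \bmod N$, and an instance of the $[x,y]_N$-line based at $(i,j)$ is taken to be $(i+ky, j+kx)$ for $k = 0,\ldots,N-1$, as in the proof of Theorem 11. The names `LINES` and `valid_linear_schemes` are hypothetical.

```python
# The four lines of the practical case, written as (x, y) pairs:
# [1,0], [0,1], [1,1] and [1,-1].
LINES = [(1, 0), (0, 1), (1, 1), (1, -1)]


def valid_linear_schemes(N, lines=LINES):
    """Return all (a, b) for which phi(i, j) = (a*i + b*j) mod N is conflict-free.

    For a linear scheme, moving the base point of an instance adds the same
    constant to every module number, so checking the instance based at (0, 0)
    is enough.
    """
    found = []
    for a in range(N):
        for b in range(N):
            conflict_free = all(
                len({(a * k * y + b * k * x) % N for k in range(N)}) == N
                for x, y in lines
            )
            if conflict_free:
                found.append((a, b))
    return found
```

Under these assumptions the search returns an empty list for N = 2, 4, 8 and 16, while for the prime N = 5 it finds, among others, $(a,b) = (1,2)$.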

It is easy to allow oneself to be misled by the conclusion of Theorem 10. When given a collection of $[x,y]_N$-lines, and deciding on a skewing scheme using N memory modules, there may be advantages to choosing a non-linear, but still periodic, skewing scheme. Some periodic skewing schemes can be so simple that they take very little hardware to perform address computation and to align the data with the correct processor, even less hardware than required by linear skewing schemes. One such periodic skewing scheme has been used in the construction of an actual machine, the STARAN [1]. Abstracting from the exact details of the STARAN design, the programmer views memory as consisting of a $2^n \times 2^n$ array. In the language of the designers of the STARAN, the programmer views the memory as having $2^n$ words of $2^n$ bits, and the programmer can indicate he wants to fetch all the bits of one word, or a bit-slice, the j-th bit of each word (Figure 22). In the terminology used here, this is equivalent to fetching arbitrary instances of the $[1,0]_{2^n}$-line and the $[0,1]_{2^n}$-line from an array of data elements, using $2^n$ memory modules to store the data, and using a periodic skewing scheme.

The skewing scheme employed is $\varphi(i,j) = (i \bmod 2^n) \oplus (j \bmod 2^n)$, where $i \bmod 2^n$ and $j \bmod 2^n$ are expressed in binary notation and $\oplus$ is exclusive-or. This periodic skewing scheme is explicitly calculated for an 8x8 array in Figure 23. Unlike the other machine designs discussed earlier, the responsibility of deciding on a skewing scheme does not rest with the programmer or compiler, but is built directly into the hardware. Indeed the user of the STARAN need not even know that a skewing scheme is employed; by appropriately setting the global address register, G, and the access mode register, M, either the correct word or bit-slice will be made to appear at the processing elements. Additional ways of setting M allow some, but not all, instances of other generalized lines (but not $[x,y]_{2^n}$-lines) to be fetched without memory conflict.
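The exclusive-or skewing scheme is short enough to state in code. The following is a minimal sketch of the formula $\varphi(i,j) = (i \bmod 2^n) \oplus (j \bmod 2^n)$ only; it does not model the STARAN's registers or connection network, and the names `xor_skew`, `skew_table` and `words_and_bit_slices_conflict_free` are hypothetical.

```python
def xor_skew(i, j, n):
    """Memory module holding element (i, j) under the exclusive-or scheme."""
    mask = (1 << n) - 1          # keeps the low n bits, i.e. reduces mod 2**n
    return (i & mask) ^ (j & mask)


def skew_table(n):
    """Module numbers for a 2**n x 2**n array, one list per row."""
    size = 1 << n
    return [[xor_skew(i, j, n) for j in range(size)] for i in range(size)]


def words_and_bit_slices_conflict_free(n):
    """Check that every row (word) and every column (bit-slice) hits each module once."""
    size = 1 << n
    table = skew_table(n)
    rows_ok = all(len(set(row)) == size for row in table)
    cols_ok = all(len({table[i][j] for i in range(size)}) == size
                  for j in range(size))
    return rows_ok and cols_ok
```

Calling `skew_table(3)` produces the kind of 8x8 table of module numbers that Figure 23 describes. The check succeeds because, for a fixed i, the map $j \mapsto i \oplus j$ is a permutation of $\{0,\ldots,2^n-1\}$, and symmetrically for a fixed j.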

The reason that this skewing scheme is of practical importance is that the address computations needed to fetch instances of the $[1,0]_{2^n}$-line and the $[0,1]_{2^n}$-line can be done using only n exclusive-or gates. This compares to 2n adders that would be needed if a linear skewing scheme were employed. Additionally, an exclusive-or can be performed in less time than an addition, since there is no carry propagation. The memory-processor connection network required in the STARAN is of the same order of complexity as Lawrie's Ω-network. The reader interested in the details of address computation, a proof of the validity of the skewing scheme, and the details of the memory-processor connection network should consult Batcher [1].

What has been called memory here is actually the memory of a single array module in the STARAN. Each array module has a memory consisting of a 256 x 256 array of bits. This memory physically consists of 256 independent memory modules, each with 256 one-bit words. By the skewing scheme described in the body of the text all the bits of any word and all the bits of any bit-slice lie in different memory modules, and can be fetched in one memory cycle. Readers interested in exact implementation details and the terminology used by the STARAN designers should consult [1].

Figure 22: Programmer's View of STARAN Memory.

Figure 23: The Periodic Skewing Scheme Used in the STARAN Computer.

This example was presented to show that non-linear periodic skewing schemes can be important in actual practice, even when there are valid linear skewing schemes. In comparing STARAN to the more general machine modeled in Figure 3, it should be noted that in STARAN the generalized lines that the programmer can access conveniently were fixed at the time of design. For these generalized lines the programmer need not concern himself with skewing schemes, as the hardware handles the data storage and unscrambling automatically. In the more general computer, modeled in Figure 3, the determination of an appropriate skewing scheme is left to the programmer, who may be restricted in his choices by the nature of the memory-processor connection network. While this may be more work for the user, it allows greater flexibility than is available in the STARAN. Some authors [11] have discussed leaving the choice of skewing scheme and/or the address computation to the compiler, thus freeing the programmer from this bookkeeping.

4. UNRESOLVED PROBLEMS AND DIRECTIONS OF FURTHER RESEARCH

4.1 The Effectiveness of Linear and Periodic Skewing Schemes

One  question  that  has  occurred  throughout  this  thesis  is: 

Given  a  collection  of  generalized  lines,  when  can  the  search  for  a  valid 

skewing  scheme  for  this  collection  safely  be  restricted  to  certain 

subclasses  of  skewing  schemes,  that  is,  when  does  a  valid  skewing  scheme 

from  a  class  of  skewing  schemes,  imply  a  valid  skewing  scheme  from  a 

subclass  of  this  class  of  skewing  schemes?  In  Chapter  1,  it  was  shown 

that  for  any  collection  of  generalized  lines,  attention  could  safely  be 

restricted  from  skewing  schemes  valid  on  the  quarter  plane  to  skewing 

schemes  valid  on  the  entire  plane.   Similarly,  Theorem  10  shows  that  for

collections  of  $[x,y]_N$-lines,  attention  can  safely  be  restricted  from

periodic  skewing  schemes,  using  N  memory  modules,  to  linear  skewing

schemes,  using  N  memory  modules.  In  general,  for  an  arbitrary  collection

of  generalized  lines,  the  question  of  when  attention  can  be  restricted 

from  arbitrary  skewing  schemes  defined  on  the  plane  to  periodic  skewing 

schemes,  and  from  periodic  skewing  schemes  to  linear  skewing  schemes  is 

unresolved.   In  this  section  some  conjectures,  partial  results,  and 

interesting  examples  are  presented  for  a  class  of  generalized  lines  called 

polyominoes  [5,6]. 

Definition 11: A polyomino is a generalized line in which given any two components, $(x_S,y_S)$ and $(x_T,y_T)$, there exists a path $(x_S,y_S) = (x_{i_1},y_{i_1}), (x_{i_2},y_{i_2}), \ldots, (x_{i_r},y_{i_r}) = (x_T,y_T)$ such that $(x_{i_j},y_{i_j})$ is a component of the generalized line, for $j = 1,2,\ldots,r$, and either $x_{i_{j+1}} = x_{i_j} \pm 1$ and $y_{i_{j+1}} = y_{i_j}$ or $x_{i_{j+1}} = x_{i_j}$ and $y_{i_{j+1}} = y_{i_j} \pm 1$, for $j = 1,2,\ldots,r-1$.


In the geometric realization of a generalized line by unit squares, a polyomino is a generalized line that is connected. Except for the disconnected shape of Figure 20, all the generalized lines used as examples in Chapter 2 are polyominoes. There is an impressive amount of literature, and many unsolved problems concerning polyominoes. A general source is [6].

In the literature, polyominoes are usually defined to be the geometric realizations of the class of generalized lines described by Definition 11. Additionally, unlike here, in most problems concerning polyominoes, rotations and reflections of a polyomino are permitted and are not regarded as generating different polyominoes. Additionally, a comment should be made about connectedness. Here, connected means connected by more than a corner. This kind of connectedness has been called rook-wise connected, because of the permissible motions of the chess piece by the same name.

Conjecture: Given N memory modules, and a polyomino of length N, then if there is a valid skewing scheme for the polyomino, there is also a valid periodic skewing scheme for the polyomino.

This conjecture is supported by consideration of a construction, illustrated initially by example. Consider the generalized line whose geometric realization is [inline figure]. Theorem 5 proves that the problem of finding valid skewing schemes is equivalent to determining tesselations of the plane. With the objective of analyzing possible tesselations, lay down an instance of this generalized line (see Figure 24). Without loss of generality the designated element can be assumed to be at (0,0). Now, there is only one way a second instance can be positioned so that the square labeled A is covered without overlapping of the instances (Figure 24(b)). Again there is only one way to position a third instance so the square labeled B is covered without any overlapping of the instances (Figure 24(c)). Finally, there is only one way for a fourth instance to be positioned so that the square labeled C is covered and, again, there is no overlapping of the instances (Figure 24(d)). Notice that the designated elements form the vertices of a parallelogram which contains "no holes," and has area N, the length of the polyomino. Now, by replication of the parallelogram, it is clear that a tesselation of the plane results. The tesselation that results is very orderly. Sometimes, particularly when the polyomino has a high degree of symmetry, the construction, informally presented above, can yield more than one parallelogram. This is illustrated, in Figure 25, by the generalized line, whose geometric realization is [inline figure]-shaped. The generalized line, whose geometric realization is [inline figure]-shaped, exhibits the same phenomenon.

Figure 24: Positioning Four Instances of the Generalized Line ((0,0), (1,0), (1,1), (2,0), (2,1)), so their Designated Elements Form a Parallelogram.

In this example, N = 5.

Figure 25: Alternate Positionings of Instances for the Generalized Line ((0,0), (1,0), (1,1), (1,2), (2,2))

When four instances of a polyomino of length N can be laid down so their designated elements form the four vertices of a parallelogram of area N and which contains no holes, then the tesselation of the plane, produced by replicating the parallelogram, induces a periodic skewing scheme. The proof of this statement is reminiscent of the proof of Theorem 11. If the vertices of the parallelogram are labeled as in Figure 24(d), then letting $\varphi$ be the skewing scheme induced by the tesselation, it is clear that

$$\varphi(c,d) = \varphi(c+p,d+q) = \varphi(c+r,d+s), \qquad (14)$$

for any $(c,d)$. Now $|ps-qr| = N$ since $|ps-qr|$ is the area of the parallelogram, and thus $(c,d)+(-rp,-rq)+(pr,ps) = (c,d+N)$ (or $(c,d-N)$, depending on the sign of $ps-qr$). Combining this with (14) implies $\varphi(c,d) = \varphi(c,d+N)$. Similarly, $(c,d)+(sp,sq)+(-qr,-qs) = (c+N,d)$ (or $(c-N,d)$), and, thus, $\varphi(c,d) = \varphi(c+N,d)$. Since $(c,d)$ was arbitrary, $\varphi$ is periodic.
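The two integer combinations used in this argument are easy to verify mechanically. In the sketch below, $(p,q)$ and $(r,s)$ are the two edge vectors of the parallelogram; the function name `axis_periods` and the sample vectors are hypothetical, chosen only so that the area equals 5, and are not read off Figure 24.

```python
def axis_periods(p, q, r, s):
    """Combine the edge vectors (p, q) and (r, s) into axis-aligned period vectors.

    Returns (det, along_second, along_first) where det = p*s - q*r,
    along_second = -r*(p, q) + p*(r, s) = (0, det) and
    along_first  =  s*(p, q) - q*(r, s) = (det, 0).
    """
    det = p * s - q * r
    along_second = (-r * p + p * r, -r * q + p * s)
    along_first = (s * p - q * r, s * q - q * s)
    return det, along_second, along_first
```

For the illustrative vectors $(2,1)$ and $(-1,2)$ the call `axis_periods(2, 1, -1, 2)` gives a determinant of 5 together with the shifts $(0,5)$ and $(5,0)$, so a scheme invariant under both edge vectors is also invariant under shifts of N = 5 along either axis.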

The obstacle standing in the way of a proof of the conjecture is the inability to prove that if a polyomino tesselates the plane, which implies the existence of a valid skewing scheme, then four instances of the polyomino can be positioned so that the designated elements form the vertices of a parallelogram of area N. This has been checked by hand for all polyominoes through N = 7 and no exceptions have been found. It is reasonable to believe that this is in fact so, since, for most polyominoes there is usually only one way a second instance can be positioned so that some carefully chosen square is covered, and at the same time the instances do not overlap. This style of argument was used to show that the [inline figure] cannot tesselate the plane (Figure 12), as well as to construct the unique tesselation of the plane (except for rigid shifting) by the [inline figure] (Figures 14 and 24). Polyominoes with some symmetry, however, frequently produce several distinct tesselations.

Even though it seems reasonable to believe that if a polyomino tesselates the plane, the construction of a parallelogram of area N will always be possible, there is some evidence to the contrary. Some authors have considered tesselating the plane, while allowing simultaneous use of several different polyominoes. Examples have been reported of collections of polyominoes from which a tesselation of the plane can be constructed, but from which no periodic tesselation (with any period whatsoever) can be constructed [5].

Because  the  example  presented  below  indicates  that  there  may 
be  some  unrecognized  subtleties,  to  close  the  discussion  of  this 
conjecture,  an  alternate  approach  to  its  proof,  known  to  be  inadequate, 
will  be  discussed.   Suppose  a  polyomino  of  length  N  is  given,  and  f  is  a 
valid  skewing  scheme  for  this  polyomino.   Then,  define  cp(i,j)  =  \|r(i  mod  N, 
j  mod  N) .  cp  is  periodic,  clearly.   The  objective  is  to  show  that  cp  is 
valid.   This  approach  is  motivated  by  consideration  of  the  [1,0]  -line. 
Clearly,  any  valid  skewing  scheme,  \|/,  has  the  property  \|/(c,d)  =  \(r(c,d+N) 
for  any  (c,d).  However,  it  is  easy  to  construct  \|/  so  that  \J/(c,d)  ^  \|/(c+N,d), 
i.e.  \|/  is  non-periodic  in  the  vertical  direction.  Note,  however,  that  cp, 
defined  by  cp(i,  j)  =  \|/  ( ±  mod  N,  j  mod  N)  is  both  periodic  and  valid  for  the 
[1,0]  -line,  provided  \|/  is  valid.   This  technique  also  works  on  generalized 
lines  whose  geometric  realization  is  an  IxL  rectangle,  where  N  N  =  N. 
However,  consideration  of  the  generalized  line  L  =  ( (0,  0),  (0,  l),  (l,  l),  (1,2)), 

whose  geometric  is   I .   ,  shows  that  such  an  approach  to  the  proof  of 

These  tesselations  should  be  carefully  distinguished  from  those  used  in 
Theorem  6.   There,  several  tesselations  were  constructed  using  different 
generalized  lines,  but  each  tesselation  used  instances  of  only  one  type. 
In  the  case  mentioned  here,  several  different  polyominoes  were  used  to 
construct  a  single  tesselation. 
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the  conjecture  is  inadequate.   Figure  26  indicates  a  tesselation, 
which  induces  a  valid  skewing  scheme,  \|r,  for  which  cp,  defined  by  cp(i,j)  = 
\|r(i  mod  N,  j  mod  N)  is  not  a  valid  skewing  scheme.   There  are,  however, 
valid  periodic  skewing  schemes  for  L. 
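The folding construction for the [1,0]-line can be illustrated with a
small sketch. Everything specific here is an assumption made for the
example: N = 4, the convention that the instance with designated element
(c,d) consists of the cells (c,d), (c,d+1), ..., (c,d+N-1) (chosen to
match the property that a valid scheme repeats from (c,d) to (c,d+N)), and
the particular non-periodic scheme `psi`.

```python
N = 4  # assumed number of memory modules (and line length)


def row_line(c, d):
    """Cells of the [1,0]-line instance whose designated element is (c, d)."""
    return [(c, d + k) for k in range(N)]


def psi(c, d):
    # Valid for the [1,0]-line, but the offset depends on c through the
    # number of 1 bits of |c|, so psi(0, 0) != psi(N, 0).
    return (d + bin(abs(c)).count("1")) % N


def phi(i, j):
    # Folded scheme: periodic with period N in both coordinates.
    return psi(i % N, j % N)


def is_valid_on_window(scheme, lo, hi):
    """Check that every instance with designated element in the window
    touches N distinct memory modules."""
    return all(
        len({scheme(*cell) for cell in row_line(c, d)}) == N
        for c in range(lo, hi)
        for d in range(lo, hi)
    )


print(is_valid_on_window(psi, -8, 8), is_valid_on_window(phi, -8, 8))
print(psi(0, 0), psi(N, 0), phi(0, 0), phi(N, 0))
```

The check only covers a finite window, so it illustrates the claim rather
than proving it; the same folding fails for the skew tetromino discussed
above.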

Figure 26: A Non-periodic Skewing Scheme, $\psi$, for which
$\varphi(i,j) = \psi(i \bmod N,\ j \bmod N)$ is not Valid.

If this conjecture is true, then given N memory modules and a polyomino
of length N, when looking for a valid skewing scheme attention can safely
be restricted to periodic skewing schemes. A good question to ask is: Can
attention safely be restricted to linear skewing schemes? While the
answer to this question is "no" in general, if N is prime and if the
existence of a tesselation by a polyomino implies that the construction
described earlier results in a parallelogram of area N, then the answer
is "yes." Using the notation of Figure 24(d), for
$\varphi(i,j) = ai + bj \bmod N$, define a and b by:

    if $(p,q) \not\equiv (0,0) \pmod{N}$ then a = -q, b = p
    else a = 1, b = 1.

To see that $\varphi$ is valid, observe first that for the exceptional
case $(p,q) \equiv (0,0) \pmod{N}$, either (p,q) = (±N,0) or
(p,q) = (0,±N). This is true because the polyomino is connected and has
only N components. Now (p,q) = (±N,0) or (0,±N) implies that the
polyomino is a [0,1]-line or a [1,0]-line, and then
$\varphi(i,j) = i + j \bmod N$ is indeed valid. When a = -q and b = p,
that $\varphi$ is valid can be seen as follows. Note that
$\varphi(p,q) = -qp + pq \bmod N = 0$, and
$\varphi(r,s) = -qr + ps \bmod N = 0$, since |ps-qr| = N. Thus adding
multiples of (p,q) and/or (r,s) to a point does not affect the value of
$\varphi$. Because of this, and the orderly way in which replications of
the parallelogram tesselate the plane, if $\varphi$ maps all the
components of one instance to distinct memory modules, then $\varphi$ is
valid. However, by adding (p,q) and/or (r,s) to some components of the
L(0,0) instance, where L is the polyomino, the condition on $\varphi$ can
be changed to: $\varphi$ maps into distinct memory modules all ordered
pairs (of integers) interior to the parallelogram or lying on the edges
whose endpoints are (0,0) and (p,q), and (0,0) and (r,s), but not
including (p,q) and (r,s). Figure 27 illustrates the situation. Now,
$\varphi$ will be the same for two ordered pairs of integers, $(i_1,j_1)$
and $(i_2,j_2)$, interior to or lying on the edges of the parallelogram
if and only if the line segment connecting them is parallel to the line
segment from (0,0) to (p,q), since only then will $|-q i_1 + p j_1|$, the
area of parallelogram I in Figure 28, equal $|-q i_2 + p j_2|$, the area
of parallelogram II in Figure 28. The existence of two ordered pairs of
integers interior to or lying on the edges of the parallelogram, for
which the line segment connecting them is parallel to the line segment
connecting (0,0) to (p,q), implies the existence of an ordered pair of
integers lying on the line segment connecting (0,0) to (p,q). But this
can occur only if p and q have a common divisor other than one. Then this
common divisor, which is less than N, divides |ps-qr| = N, which is
impossible since N is prime. This contradiction implies $\varphi$ is a
valid linear skewing scheme.
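The rule for choosing a and b can be tried on a small case. The sketch
below uses the pentomino of Figure 25 (N = 5, prime); the lattice vectors
(p,q) = (1,-1) and (r,s) = (2,3) are assumed for illustration and are not
read off any figure, and the helper names are hypothetical.

```python
def linear_scheme_from_parallelogram(p, q, n):
    """Coefficients (a, b) of phi(i, j) = a*i + b*j mod n, using the rule
    a = -q, b = p unless (p, q) is congruent to (0, 0) mod n."""
    if (p % n, q % n) != (0, 0):
        return (-q) % n, p % n
    return 1, 1


def is_valid_linear(a, b, n, polyomino):
    """For a linear scheme, phi(c+i, d+j) = phi(c, d) + phi(i, j) mod n,
    so it suffices that the base instance hits distinct modules."""
    modules = {(a * i + b * j) % n for i, j in polyomino}
    return len(modules) == len(polyomino)


L = [(0, 0), (1, 0), (1, 1), (1, 2), (2, 2)]  # generalized line of Figure 25
N = len(L)
p, q, r, s = 1, -1, 2, 3                      # assumed lattice vectors, |ps-qr| = 5

a, b = linear_scheme_from_parallelogram(p, q, N)
print(a, b, is_valid_linear(a, b, N, L))
print((a * p + b * q) % N, (a * r + b * s) % N)  # phi vanishes on both vectors
```

With these assumed vectors the rule yields $\varphi(i,j) = i + j \bmod 5$,
which sends the five components to five different modules.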

Figure 27: Examples of Translating by (p,q) and/or (r,s), so that all the
Components of the Instance of the Generalized Line Lie Interior to the
Parallelogram.

Figure 28: Components of an Instance of a Generalized Line, after
Translation by (p,q) and/or (r,s), which are Stored in Same Memory
Module. (Labels in the figure: parallelogram I, parallelogram II, and the
parallelogram formed from the designated elements.)

Figure 29: An Example of a Polyomino for which There is a Valid Periodic
Skewing Scheme, but no Valid Linear Skewing Scheme.

When N is not a prime number, it is not always possible to restrict
attention to linear skewing schemes when determining if valid skewing
schemes exist for a polyomino. Figure 29 illustrates that four instances
of the generalized line L = ((0,0), (0,1), (0,2), (0,3), (1,0), (1,1),
(1,2), (1,3), (2,0), (2,1), (3,0), (3,1)) can be positioned so that the
designated elements form a parallelogram of area N (N = 12). Thus there
is a valid periodic skewing scheme for L. There is no valid linear
skewing scheme for L, as trying all the possibilities indicates. Notice
that the line segments connecting (0,0) and (p,q) and (0,0) and (r,s)
pass through ordered pairs of integers. It is frequently the case that
for N a composite number, one of these line segments will not pass
through any ordered pairs of integers. In such a case, as in the proof
outlined above, $\varphi(i,j) = ai + bj \bmod N$ will be a valid linear
skewing scheme, where a = -q and b = p or a = -s and b = r, depending on
which line segment does not pass through an ordered pair of integers.
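"Trying all the possibilities" for the polyomino of Figure 29 is a small
exhaustive search over the 144 coefficient pairs (a,b). The sketch below
is one way to carry it out; the function name is hypothetical, and the
validity test relies on the fact that a linear scheme is valid exactly
when the base instance is mapped to distinct modules.

```python
def valid_linear_schemes(polyomino, n):
    """All (a, b) for which phi(i, j) = a*i + b*j mod n sends the
    components of the polyomino to pairwise distinct memory modules."""
    return [
        (a, b)
        for a in range(n)
        for b in range(n)
        if len({(a * i + b * j) % n for i, j in polyomino}) == len(polyomino)
    ]


# Generalized line of Figure 29 (N = 12).
L = [(0, 0), (0, 1), (0, 2), (0, 3),
     (1, 0), (1, 1), (1, 2), (1, 3),
     (2, 0), (2, 1), (3, 0), (3, 1)]

print(valid_linear_schemes(L, len(L)))  # empty list: no linear scheme exists
```

The same function applied to a polyomino of prime length that tesselates
the plane would be expected to return a non-empty list, in line with the
argument given above for prime N.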

In  summary,  there  is  evidence  to  believe  that  in  determining  the 
existence  of  a  valid  skewing  scheme  for  a  polyomino,  using  the  same 
number  of  memory  modules  as  the  length  of  the  polyomino,  only  periodic 
skewing  schemes  need  be  considered,  and  if  N  is  prime  attention  can  be 
restricted  further,  to  linear  skewing  schemes. 

4.2 Questions Relating to Memory Utilization

Throughout much of this work attention has been focused on the case
where the length of the generalized line(s) and the number of memory
modules are equal. Motivation for this was presented at the beginning of
Chapter 2 by considering computers designed as in Figure 3. When dealing
with [x,y]-lines, setting the length of the generalized lines equal to
the number of memory modules is an honest reflection of the real
situation. In actual computations involving [x,y]-lines, and in
particular the [1,0]-line and the [0,1]-line, what is often wanted is an
entire row or column of a matrix. However, the number of processors in
the actual computer dictates that the row or column can be processed no
faster than N elements at a time, where the number of processors is taken
to be equal to the number of memory modules, N. Since N elements of the
row or column are all that can be fetched and processed in one memory
cycle, [x,y]-lines of length N are clearly the natural candidates for
consideration.

On the other hand, when a computation uses the generalized line
L = ((0,0), (0,1), (0,2), (1,1), (2,0), (2,1), (2,2)), whose geometric
realization is H-shaped (or serifed-I-shaped, depending on orientation),
if the number of processors and the number of memories are greater than
the length of the generalized line, then some components of the computer
may just have to idle; the algorithm may not be able to utilize the full
potential of the machine. Having extra hardware available, which cannot
be utilized by an algorithm, may still result in improved performance,
however. To see this, again consider an example presented in Chapter 2.
Figure 12 indicated why the T-shaped pentomino cannot tesselate the
plane, and hence there exists no valid skewing scheme, using only five
memory modules, for the generalized line L = ((0,0), (0,1), (0,2), (1,1),
(2,1)). However, if the computer has more than five memory modules,
perhaps a valid skewing scheme can be found which uses six or more memory
modules. This motivates

Definition 12: A generalized line, L', is called a cover for another
generalized line, L, if every component of L is contained in L'.

Notice that if $\varphi$ is valid for L', a cover of the generalized line
L, then $\varphi$ is also valid for L. The length of the generalized line
used as a cover may conveniently be chosen to be the number of memory
modules of the computer, and then Theorem 5 can be applied to determine
valid skewing schemes for the cover.
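A small sketch of the cover idea for the T-shaped generalized line
L = ((0,0), (0,1), (0,2), (1,1), (2,1)) follows. It does not implement
Theorem 5; as a stand-in it searches only linear schemes by brute force.
The cover `L_COVER`, obtained by adding the cell (1,2), and all function
names are assumptions made for the example.

```python
def is_cover(cover, line):
    """Definition 12: every component of `line` is contained in `cover`."""
    return set(line) <= set(cover)


def valid_linear_schemes(line, n):
    """All (a, b) for which phi(i, j) = a*i + b*j mod n maps the
    components of `line` to pairwise distinct memory modules."""
    return [
        (a, b)
        for a in range(n)
        for b in range(n)
        if len({(a * i + b * j) % n for i, j in line}) == len(line)
    ]


L = [(0, 0), (0, 1), (0, 2), (1, 1), (2, 1)]   # length 5
L_COVER = L + [(1, 2)]                         # assumed cover of length 6

print(is_cover(L_COVER, L))
print(valid_linear_schemes(L, 5))              # none with five modules
schemes_for_cover = valid_linear_schemes(L_COVER, 6)
print(schemes_for_cover)
# Any scheme valid for the cover is automatically valid for L itself.
print(all(s in valid_linear_schemes(L, 6) for s in schemes_for_cover))
```

The empty result for five modules is consistent with the stronger
statement in the text that no valid skewing scheme of any kind exists
with five modules; with six modules, $\varphi(i,j) = 2i + j \bmod 6$ is
one scheme the search finds for the cover.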
In general, given a collection of generalized lines,
$\{L_1, L_2, \ldots, L_p\}$, not necessarily all of the same length,
determination of the minimum number of memory modules, N, so that a valid
skewing scheme, using N memory modules, exists for the collection, is
quite difficult. A simple heuristic is to find a collection of
generalized lines, $\{L_1', L_2', \ldots, L_p'\}$, such that $L_i'$ is a
cover of $L_i$, the $L_i'$ are all of the same length, and such that they
satisfy the conditions of Theorem 6. The resulting skewing scheme will
also be valid for $\{L_1, L_2, \ldots, L_p\}$. There is often great
latitude in choosing the covers, since given a generalized line, L, there
may be many choices for L', so that L' tesselates the plane. For the
generalized line L = ((0,0), (0,1), (0,2), (1,1), (2,1)), if the search
for covers that tesselate the plane is arbitrarily restricted to
polyominoes, and covers symmetric to other covers are ignored, there are
still three generalized lines that tesselate the plane and are covers for
L (see Figure 30). The rapid growth in the number of covers which are
able to tesselate the plane indicates that the formal evaluation of the
heuristic given earlier may be very difficult.

Conjecture: Given a polyomino, L, of length N, for which there is no
valid skewing scheme using N memory modules, there is a valid skewing
scheme for L using N+1 memory modules if and only if there is a cover,
L', for L, of length N+1, L' a polyomino, which tesselates the plane.
Figure 30: Covers for the Generalized Line ((0,0), (0,1), (0,2), (1,1),
(2,1)) which Tesselate the Plane.

4.3 Comments on Broader Problems

In closing, it is useful to relate this research to the construction of
real machines and to the work of others. Computers similar to the one
modeled in Figure 3 have been built or proposed. Many researchers have
realized that a very important problem is the construction of
memory-processor connection networks [10,14]. In actual computations it
is necessary to align the data so that the first component of an instance
always appears at processor $i_1$, the second component of the instance
always appears at processor $i_2$, etc. An example may be helpful.
Consider the generalized line L = ((0,0), (0,1), (0,2), (1,1), (2,0),
(2,1), (2,2)) and the periodic skewing scheme depicted in Figure 13. If
an instance is fetched whose designated element is stored in memory
module zero, then the data is already aligned, i.e. the element demanded
by processor zero is in memory module zero, the element demanded by
processor one is in memory module one, etc. For an instance whose
designated element is stored in memory module one, however, the data from
memory module one needs to be routed to processor zero, the data from
memory module two needs to be routed to processor one, the data in memory
module four needs to be routed to processor two, etc. This situation can
be described as follows: If an instance of L is fetched whose designated
element is stored in memory module one, then to align the data with the
processors, the memory-processor connection network must be able to sort
the permutation (1 2 4 0 5 6 3). The phrase "sort the permutation" is
appropriate since paths must be established from processor zero to memory
module one, from processor one to memory module two, etc., and this can
be thought of as placing (1 2 4 0 5 6 3) as input at the processor side
of the network and getting as output (0 1 2 3 4 5 6) at the memory side
of the network. Using this terminology, to be able to align any instance
of L, using the skewing scheme of Figure 13, the network must be able to
sort (0 1 2 3 4 5 6), (1 2 4 0 5 6 3), (2 4 5 1 6 3 0), (3 0 1 6 2 4 5),
(4 5 6 2 3 0 1), (5 6 3 4 0 1 2), and (6 3 0 5 1 2 4). In general, if a
periodic skewing scheme is used, derived from a tesselation generated by
replicating parallelograms constructed as described in Section 4.1, then,
in order to align the data, the memory-processor connection network must
be able to sort N permutations. Standard fan-in arguments show that this
will take $O(\log_2 N)$ time.
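The routing implied by "sorting" one of these permutations can be written
out explicitly. The sketch below uses the seven permutations listed in the
text for the scheme of Figure 13; the function names and the placeholder
data values are hypothetical, and the code models only the required data
movement, not any particular network.

```python
# ALIGNMENTS[m][k] = memory module holding the element wanted by
# processor k, for an instance whose designated element is in module m.
ALIGNMENTS = [
    (0, 1, 2, 3, 4, 5, 6),
    (1, 2, 4, 0, 5, 6, 3),
    (2, 4, 5, 1, 6, 3, 0),
    (3, 0, 1, 6, 2, 4, 5),
    (4, 5, 6, 2, 3, 0, 1),
    (5, 6, 3, 4, 0, 1, 2),
    (6, 3, 0, 5, 1, 2, 4),
]


def routing_table(perm):
    """Invert perm: route[m] is the processor that memory module m feeds."""
    route = [None] * len(perm)
    for processor, module in enumerate(perm):
        route[module] = processor
    return route


def align(perm, module_outputs):
    """Data as seen by processors 0, 1, ..., N-1 after alignment."""
    return [module_outputs[module] for module in perm]


outputs = ["m0", "m1", "m2", "m3", "m4", "m5", "m6"]  # placeholder data
print(routing_table(ALIGNMENTS[1]))
print(align(ALIGNMENTS[1], outputs))
```

For the instance whose designated element is in module one, the table
routes module one to processor zero, module two to processor one, and
module four to processor two, as described in the text.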

The results of Chapter 3 and Section 4.1 are very encouraging. It
appears that in the most important practical cases valid linear skewing
schemes exist if any valid skewing schemes exist. When linear skewing
schemes are employed the memory-processor connection network is simpler.
If it is unnecessary for the first component of the generalized line to
go to processor zero, but if, on the other hand, it is sufficient that it
always go to processor $i_1$, and similarly for the other components of
the generalized line, then a network capable of performing arbitrary
shifts is adequate. That is, the permutations to be sorted are just
(0 1 2 ... N-1), (1 2 3 ... N-1 0), (2 3 4 ... N-1 0 1), ...,
(N-1 0 1 ... N-2).
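That a linear skewing scheme needs only cyclic shifts can be seen on the
seven-component generalized line used earlier. The scheme
$\varphi(i,j) = 2i + j \bmod 7$ in the sketch below is an assumption made
for illustration (it is not the scheme of Figure 13); it happens to send
the k-th component of the base instance to module k.

```python
L = [(0, 0), (0, 1), (0, 2), (1, 1), (2, 0), (2, 1), (2, 2)]
N = len(L)
A, B = 2, 1  # assumed coefficients of the linear scheme


def phi(i, j):
    return (A * i + B * j) % N


def alignment(c, d):
    """Modules holding components 0..N-1 of the instance placed at (c, d)."""
    return [phi(c + i, d + j) for i, j in L]


def is_cyclic_shift(perm):
    """True if perm is (s, s+1, ..., s+N-1) mod N for some shift s."""
    return all((perm[k] - perm[0]) % N == k for k in range(N))


print(alignment(0, 0))
print(alignment(0, 1))
print(all(is_cyclic_shift(alignment(c, d))
          for c in range(-7, 8) for d in range(-7, 8)))
```

For a linear scheme that does not map component k to module k, the same
computation gives one fixed permutation followed by a shift, which
corresponds to each component always going to its own fixed processor.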

If only one generalized line is required for a computation, then there
are no difficulties created by always sending the first component to
processor $i_1$, the second component to processor $i_2$, etc. If several
generalized lines are used by an algorithm, however, problems may
develop. It is often the case, as in matrix multiplication, that the
algorithm will require that the j-th component of all the generalized
lines used be sent to the same processor, $i_j$. A simple shifting
network may no longer be adequate, since the j-th component of one
generalized line may not be sent to the same processor as the j-th
component of some other generalized line. It is necessary to apply a
corrective transformation after (or before) the shifting is performed. A
different transformation may be required for each generalized line. For N
a power of two, Lawrie's $\Omega$-network [10] performs the shifting and
the additional transformations needed simultaneously, without additional
time delay or extra hardware. Unfortunately, as the results presented
here indicate, when only the problem of finding valid skewing schemes is
considered, a power of two is not the best choice for N. Taking N to be a
prime number gives consistently better results. Taking N to be prime has
two major disadvantages, however. The modular arithmetic is much slower,
and no networks that can perform as well as the $\Omega$-network are
known.

One possible way of constructing such a memory-processor connection
network is to follow the shifting network by a Benes network [2], which
can sort any permutation. The best algorithm for setting up the Benes
network is due to Opferman and Tsao-Wu [12]. This algorithm, which takes
$O(N \log N)$ time, is too slow for use on each memory cycle. However,
since the generalized lines used by a program are fixed at compile time,
the way in which the network needs to be set up for each generalized line
used by the program can be calculated once, before the program is run,
stored in a memory, and then read out to set the network up rapidly when
needed. Adding a Benes network to the memory-processor connection
network, as just described, will create additional cost and will slow
down the machine somewhat, since the data will have to pass through more
gates. This problem, and particularly incorporating the Benes network in
with the shifting network, is a good candidate for further research.